Models and Heuristics for Robust Resource Allocation in Parallel and Distributed Computing Systems

Similar documents
ROBUST RESOURCE ALLOCATION IN HETEROGENEOUS PARALLEL AND DISTRIBUTED COMPUTING SYSTEMS INTRODUCTION

Robust static allocation of resources for independent tasks under makespan and dollar cost constraints

TODAY S computing systems are often heterogeneous

Dynamic Resource Allocation Heuristics for Maximizing Robustness with an Overall Makespan Constraint in an Uncertain Environment

A Multidimensional Robust Greedy Algorithm for Resource Path Finding in Large-Scale Distributed Networks

Measuring the Robustness of Resource Allocations in a Stochastic Dynamic Environment

Stochastic robustness metric and its use for static resource allocations

Bi-Objective Optimization for Scheduling in Heterogeneous Computing Systems

Processor Allocation for Tasks that is Robust Against Errors in Computation Time Estimates

Kyle M. Tarplee 1, Ryan Friese 1, Anthony A. Maciejewski 1, H.J. Siegel 1,2. Department of Electrical and Computer Engineering 2

A STUDY OF BNP PARALLEL TASK SCHEDULING ALGORITHMS METRIC S FOR DISTRIBUTED DATABASE SYSTEM Manik Sharma 1, Dr. Gurdev Singh 2 and Harsimran Kaur 3

Curriculum Vitae Shoukat Ali

Study of an Iterative Technique to Minimize Completion Times of Non-Makespan Machines

Clustering. Robert M. Haralick. Computer Science, Graduate Center City University of New York

Coordinated carrier aggregation for campus of home base stations

Mapping Heuristics in Heterogeneous Computing

The 2-core of a Non-homogeneous Hypergraph

Survey of Evolutionary and Probabilistic Approaches for Source Term Estimation!

Note Set 4: Finite Mixture Models and the EM Algorithm

Sensor Tasking and Control

Tasks Scheduling using Ant Colony Optimization

Searching with Partial Information

A Course on Meta-Heuristic Search Methods for Combinatorial Optimization Problems

Varianzbasierte Robustheitsoptimierung

STATS306B STATS306B. Clustering. Jonathan Taylor Department of Statistics Stanford University. June 3, 2010

Towards the robustness of high-performance execution of multiscale numerical simulation codes hosted by the Cyberinfrastructure of MSU

A Genetic Algorithm for Multiprocessor Task Scheduling

Lecture on Modeling Tools for Clustering & Regression

Sensitivity and optimization methods applied to the dynamic fuel cycle

Summer School in Statistics for Astronomers & Physicists June 15-17, Cluster Analysis

K-Means Clustering. Sargur Srihari

ROBUST OPTIMIZATION. Lecturer : Majid Rafiee

J. Parallel Distrib. Comput.

Information-Driven Dynamic Sensor Collaboration for Tracking Applications

Probabilistic Worst-Case Response-Time Analysis for the Controller Area Network

Assignment 1 Introduction to Graph Theory CO342

THE objective of the redundancy allocation problem is to

Prediction-Based Admission Control for IaaS Clouds with Multiple Service Classes

Toward the joint design of electronic and optical layer protection

Clustering. Chapter 10 in Introduction to statistical learning

Advanced Digital Signal Processing Adaptive Linear Prediction Filter (Using The RLS Algorithm)

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Slides From Lecture Notes for Chapter 8. Introduction to Data Mining

Machine Learning and Data Mining. Clustering (1): Basics. Kalev Kask

SCHEDULING WORKFLOWS WITH BUDGET CONSTRAINTS

PUBLICATIONS. Journal Papers

Towards Green Cloud Computing: Demand Allocation and Pricing Policies for Cloud Service Brokerage Chenxi Qiu

Relative Significance of Trajectory Prediction Errors on an Automated Separation Assurance Algorithm

Multi-Disciplinary Design of an Aircraft Landing Gear with Altair HyperWorks

A Survey of Current Directions in Service Placement in Mobile Ad-hoc Networks

Improving Dual Bound for Stochastic MILP Models Using Sensitivity Analysis

Clustering CS 550: Machine Learning

Simulation: Solving Dynamic Models ABE 5646 Week 12, Spring 2009

Cpk: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc.

Scheduling Real Time Parallel Structure on Cluster Computing with Possible Processor failures

Modeling and Tolerating Heterogeneous Failures in Large Parallel Systems

CONSTRUCTION AND EVALUATION OF MESHES BASED ON SHORTEST PATH TREE VS. STEINER TREE FOR MULTICAST ROUTING IN MOBILE AD HOC NETWORKS

Budget Transfers. To initiate a budget transfer, go to FGAJVCM (Journal Voucher Mass Entry). FGAJVCM

Automated Clustering-Based Workload Characterization

Parametric. Practices. Patrick Cunningham. CAE Associates Inc. and ANSYS Inc. Proprietary 2012 CAE Associates Inc. and ANSYS Inc. All rights reserved.

5 (Monday, 08/14/2017)

Job Shop Scheduling Problem (JSSP) Genetic Algorithms Critical Block and DG distance Neighbourhood Search

Workflow Scheduling Using Heuristics Based Ant Colony Optimization

Queueing Theoretic Approach to Job Assignment Strategy Considering Various Inter-arrival of Job in Fog Computing

Optimal Routing and Scheduling in Multihop Wireless Renewable Energy Networks

Curriculum Vitae Shoukat Ali

Rollout Algorithms for Stochastic Scheduling Problems

Hardware-Software Codesign

Towards Deadline Guaranteed Cloud Storage Services Guoxin Liu, Haiying Shen, and Lei Yu

Clustering. CE-717: Machine Learning Sharif University of Technology Spring Soleymani

QoS Guided Min-Mean Task Scheduling Algorithm for Scheduling Dr.G.K.Kamalam

Recursive column generation for the Tactical Berth Allocation Problem

Comparison and analysis of eight scheduling heuristics for the optimization of energy consumption and makespan in large-scale distributed systems

Modeling and Reasoning with Bayesian Networks. Adnan Darwiche University of California Los Angeles, CA

On Cluster Resource Allocation for Multiple Parallel Task Graphs

Context based optimal shape coding

Energy-Efficient Cooperative Communication In Clustered Wireless Sensor Networks

Unsupervised Learning : Clustering

Deadline Guaranteed Service for Multi- Tenant Cloud Storage Guoxin Liu and Haiying Shen

ABSTRACT I. INTRODUCTION

Introduction to Approximation Algorithms

A Level-wise Priority Based Task Scheduling for Heterogeneous Systems

Algorithm Design Methods. Some Methods Not Covered

RELIABILITY IN CLOUD COMPUTING SYSTEMS: SESSION 1

Scalable Computing: Practice and Experience Volume 8, Number 3, pp

Layered Multicast with Forward Error Correction (FEC) for Internet Video

Selecting DaoOpt solver configuration for MPE problems

Evaluation and comparison of gene clustering methods in microarray analysis

Scheduling. Job Shop Scheduling. Example JSP. JSP (cont.)

A Fuzzy Based Approach for Priority Allocation in Wireless Sensor Networks Imen Bouazzi #1, Jamila Bhar #2, Semia Barouni *3, Mohamed Atri #4

A novel method for identification of critical points in flow sheet synthesis under uncertainty

Clustering. Introduction to Data Science University of Colorado Boulder SLIDES ADAPTED FROM LAUREN HANNAH

k-center Problems Joey Durham Graphs, Combinatorics and Convex Optimization Reading Group Summer 2008

Applications. Foreground / background segmentation Finding skin-colored regions. Finding the moving objects. Intelligent scissors

Scheduling in Highly Uncertain Environments

A Two-Stage Stochastic Programming Approach for Location-Allocation Models in Uncertain Environments

Introduction to Stochastic Combinatorial Optimization

Unsupervised Learning: Clustering

Quality of Service Routing

A New Call Admission Control scheme for Real-time traffic in Wireless Networks

Routing and Traffic Engineering in Hybrid RF/FSO Networks*

Transcription:

Models and Heuristics for Robust Resource Allocation in Parallel and Distributed Computing Systems D. Janovy, J. Smith, H. J. Siegel, and A. A. Maciejewski Colorado State University Outline n models and metrics for robust resource allocation n deterministic robustness n stochastic robustness n current research

2 Simple Example: Cluster Computing n cluster of M heterogeneous machines n T independent tasks 5production environment n given estimated time to compute task i on machine j 5for each task on each machine n need to allocate resources to tasks 5static, off-line allocation n makespan: time to complete all tasks 5minimize estimated makespan n want to be robust with respect to uncertainty of execution time estimates 5actual makespan 1.2 estimated makespan 51.2 is a user specified constraint total execution time estimated makespan mach A task 1 task 3 mach B task 2

Deterministic Robust Resource Allocation Measuring the Robustness of a Resource Allocation, IEEE Transactions on Parallel and Distributed Systems, July 2004 The Three Robustness Questions n what behavior of the system makes it robust? 5ex. actual makespan 1.2 estimated makespan n what uncertainties is the system robust against? 5ex. variations in actual execution times n quantitatively, exactly how robust is the system? 5ex. largest collective increase (Euclidian measure) in actual task completion times before a makespan constraint violation occurs 3

FePIA procedure for cluster computing example n performance Features 5ex. F j : finishing time of machine j for an allocation n Perturbation parameter 5ex. vector of actual task computation times for given allocation n Impact of perturbation parameter on performance features 5ex. F j = sum of computation times of tasks on machine j n Analysis 5ex. robustness radius for machine j 4 Cluster Computing Example: Robustness Metric gsmallest collective increase between estimated and actual computation times that will cause F j to be > 1.2 estimated makespan 5robustness metric = minimum robustness radius total execution time 1.2 est. makespan finish time task 2 task 1 mach j

Results of FePIA Analysis Step for Machine j perturbation parameter = t = [t 1 t 2 ] t 1 estimated values t 2 nsmallest change in t from its estimated value 5that causes violation of specified constraint 5robustness radius r µ for resource allocation µ plot for resource allocation µ where {t F j (t) = 1.2 x estimated makespan} nrobustness metric: minimum robustness radii over all machines 5

Cluster: Robustness Versus Estimated Makespan n 1000 random resource allocations for 20 tasks and 5 machines n normalized robustness = robust metric / estimated makespan each point represents one resource allocation 6 estimated

7 Deterministic Robustness: Static Mapping Robust Static Allocation of Resources for Independent Tasks under Makespan and Dollar Cost Constraints, Journal of Parallel and Distributed Computing, accepted, to appear n robust resource allocations with uncertain task execution times n two problem variations were considered 5fixed machine suite in a production environment g task execution times based on estimates g maximize robustness within a given makespan constraint 5machine selection with a purchasing cost constraint g machines vary in performance and cost g select a suite of machines that maximizes robustness while meeting makespan constraints n six heuristics were designed and evaluated

8 Deterministic Robustness: Dynamic Mapping Dynamic Resource Allocation Heuristics that Manage Tradeoff between Makespan and Robustness, Journal of Supercomputing, accepted, to appear n establish the notion of deterministic dynamic robustness n task arrival times are not known in advance n fixed suite of machines n robust resource allocations with uncertain task execution times n two problem variations were considered 5minimize makespan while maintaining a specified level of robustness over all mapping events 5maximize the minimum robustness over time within a given makespan constraint n ten heuristics were designed and evaluated

9 Stochastic Robustness Model: Cluster Example A Stochastic Approach to Measuring the Robustness of Resource Allocations in Distributed Systems, 2006 International Conference on Parallel Processing, Aug. 2006 n stochastic robustness metric is probability that completion time of all machines meets requirements probability density function task 3 on A est. makespan (mean) task 3 on B makespan constraint time total execution time probability of exceeding makespan time task 1 mach A task 3 makespan constraint task 2 mach B

Stochastic Robustness: Mapping Example n periodic data sets received from sensors n changes in input data cause variability in execution time n data needs to be processed before next data set arrives n goal: minimize the period Λ, between data sets to allow more sets to be processed Λ a a 11 n 1 1 a1m a n M M 10

Stochastic Robustness: Greedy & Iterative Static Mapping Greedy Approaches to Static Stochastic Robust Resource Allocation for Periodic Sensor Driven Distributed Systems, 2006 International Conference on Parallel and Distributed Processing Techniques and Applications, June 2006 5ex. two-phase min-min type approach Iterative Algorithms for Stochastically Robust Static Resource Allocation in Periodic Sensor Driven Clusters, 18th IASTED International Conference on Parallel and Distributed Computing and Systems, Nov. 2006 5iterative static heuristics (versus greedy) look at entire resource allocation during each iteration 5ex. genetic algorithm type approach 11

Stochastic Robustness: Dynamic Mapping Measuring the Robustness of Resource Allocations in a Stochastic Dynamic Environment, 21st International Parallel and Distributed Processing Symposium, Mar. 2007, Session 26, Thursday ~ 4:30 n establish the notion of stochastic dynamic robustness n environment consisted of a heterogeneous, distributed computing system designed for a high volume web site n tasks arrival times are not known in advance n task execution times described by probability mass functions n dynamic stochastic robustness metric is defined as the average over all mapping events of the instantaneous stochastic robustness metric values n used to validate the dynamic robustness as a predictor of performance 12

13 Current Robustness Research n design of robust resource allocation in distributed systems under random machine failures using stochastic data 5reallocation of resources is allowed 5redundant task assignment to ensure high priority task complete when resource reallocation is not allowed n resource allocations for virtual world environments 5where the number of users is uncertain 5response time is robust to the number of users being added to the system 5also applies to calculations being distributed across a P2P network n how sensitive are resource allocation algorithms to errors in the models of uncertainty n use of dynamic stochastic robustness measure to guide resource allocations in a dynamic environment