Performance Modeling of a Cluster of Workstations
Ahmed M. Mohamed, Lester Lipsky and Reda A. Ammar
Dept. of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269

Abstract

Using off-the-shelf commodity workstations to build a cluster for parallel computing has become common practice. In studying or designing a cluster of workstations, one should have available a robust analytical model that includes the major parameters that determine the cluster performance. In this paper, we present such a model for evaluating a cluster's performance. The model covers the effect of storage limitations, interconnection networks and the impact of data partitioning. The model can be used to estimate the throughput of the cluster or the expected service time of the tasks under any specific configuration. It can also detect the bottlenecks in the system, which can lead to more effective utilization of the available resources. The model we use (a Multi-Class Jackson Network) can be considered the baseline for cluster architecture analysis because it models the system behavior without using any special task or scheduling algorithms.

Key words: Cluster Computing; Queueing Analysis; Performance Modeling; Jackson Network.

1. Introduction

The development and use of cluster based computing is increasingly becoming the effective approach for solving high performance computing problems. The trend of moving away from specialized traditional supercomputing platforms such as the Cray / SGI T3E to cheap, general purpose systems consisting of loosely coupled components is expected to continue. The cluster approach gives users flexibility in constructing, upgrading, and scaling a parallel system for a given budget, which is suitable for a large class of applications and workloads. Clearly there is a strong need and role for the integration of performance analysis in the design of clusters and cluster based applications.
However, the role of performance analysis has always lagged behind the structural and management aspects of software engineering. In the literature, there are four major approaches that can be used in performance analysis: analytical modeling, measurement, simulation and statistical prediction.

Analytic models [5,4,2,23,] depend on the construction of symbolic expressions for different performance metrics. Once the expression is available, suitable mathematical functions can be applied to perform almost any kind of analysis. This approach is suitable for analyzing both existing and future systems.

The second approach, statistical prediction [8,], predicts execution time using past observations. Statistical methods have the advantage that they do not need any direct knowledge of the internal design of the algorithm or the machine.

In direct measurement, the basic concept is to collect performance data online during the execution of the problem. An obvious disadvantage is that it requires the target computing system to be available, along with fully developed and tested implementations of every potentially different design. Several worthy solutions for this approach are presented in the literature [6,22].

Relaxing the need for the target computing platform is perhaps the most useful characteristic of the simulation approach [,2,7]. Simulation appears to be a good compromise between the accuracy of time consuming direct measurement and the mathematical approaches. However, the complexity of constructing simulation models and the huge computing resources they need have often prevented the use of simulation.
In our analysis of a cluster of workstations, we use the analytical modeling approach. Section 2 gives a brief background on some of the analytical models that have been developed for this environment. Section 3 describes our performance model. We present our results in Section 4.

2. Related Work

The success of any performance model depends on how accurately the model matches the system and, more importantly, on what insights it provides for performance analysis. In this section we address some of the performance models for clusters.

Petri nets (PN) have been heavily used for performance modeling of parallel machines, as in Marsan [15], Balbo [3] and Trivedi [19,20]. Benitez [14] has used them to develop a performance model that predicts the execution time of a parallel application running on a heterogeneous cluster. The model is a mix of Petri nets and continuous time Markov chains. Benitez used the task graph model to represent the parallel application. Given that the execution times of the tasks are exponentially distributed, the firing times of the transitions in the PN are exponentially distributed. This makes the reachability graph equivalent to a continuous time Markov chain (CTMC). Although this model captures some performance parameters, it does not address the performance bottlenecks in the system, such as communication contention and contention due to resource sharing.

Zhang, Yan and Song [23] have developed a model that mixes simulation, measurement and analysis. They used task graphs to represent the parallel application. Communication contention is estimated by simulation. The geometric distribution is used to model the non-dedication property. The use of the deterministic distribution, and the claim that it is better than the exponential distribution for modeling the service time, is highly questionable.
The method used to model communication contention is not accurate, because it uses deterministic analysis for the shared communication channel in the simulation. Modeling the non-dedication property by a geometric distribution is satisfactory, but the geometric distribution is no more than the discrete version of the exponential distribution.

Li and Antonio developed another probabilistic model in [12]. Individual task execution time distributions are assumed to be known. A probabilistic model for data transmission time is developed to model the network behavior. Three random variables are used to represent each task: start time, execution time and finish time. The analysis in that paper is general, since it may be applied to any distribution. However, it is hard to develop and it ignores the effect of contention.

Berman [4] introduced a model that calculates the slowdown imposed on applications in time-shared multi-user clusters. The model focuses on three kinds of slowdown: local slowdown, communication slowdown and aggregate slowdown. The authors claimed that, based on their experiments, the time to execute an application in dedicated mode and the time to execute the same application under contention are related by a constant factor, which they identified as the slowdown caused by contention for resources. This model is based only on experiments, so it cannot be applied in general. It also assumes that no contention occurs in dedicated environments, and it ignores queueing delays.

Varki [21] has developed a simple response time approximation for parallel systems with exponential service time distributions. Markov chains are used as the analytical tool. The response time expression is derived by employing alternate representations of the parallel system and then equating the parameters of the alternate but equivalent representations. It is worth mentioning that the solution to this model is known analytically in order statistics.
$$\Pr(\max < x) = [F(x)]^n,$$

where $F(x)$ is the probability distribution function for the tasks.

In our analysis, we focus on the architectural limitations. As discussed above, other work in this area does not consider how these limitations affect the performance of the system. In most of the work we have seen so far, CPU speed is the only factor that has been taken into account. We believe this is not enough. Contention in the communication links, contention at the shared disks, contention at the CPU and the way the shared data is distributed are all equally important. The model we discuss next can capture the effects of all of the above and predict the performance of the running application.

3. The Performance Model

In studying or designing a cluster of workstations, one should have available a robust analytical model that includes the major parameters that determine the cluster performance. The major parameters we model include communication contention, geometry configurations, the time needed to access different resources, and data distribution. More details can always be added to the basic model, such as scheduler overheads, multitasking, inhomogeneity and task dependencies. Such a model is useful for gaining a basic understanding of how the system performs.

3.1 Application Model

In this section we describe how we model the target parallel application. The parallel application (or job) can be considered to be a set of N independent and identically distributed (iid) tasks, $\{t_1, t_2, \ldots, t_N\}$, where each task is itself made up of a sequence of requests for CPU, local data, and global or remote data. The number of tasks is presumed to be much greater than the number of workstations (WS) (or PCs) that make up the cluster. We assume that each WS is made up of a CPU and a disk drive. The tasks are queued up, and the first K tasks are assigned to the cluster. When a task is finished, it is immediately replaced by another task from the queue. The set of active tasks can communicate with each other by exchanging data from each other's disks. The tasks run in parallel, but they must queue for service when they wish to access the same device.

Each task consists of a finite number of instructions, either I/O instructions (which need local or remote disk access) or non-I/O instructions (CPU activity). Thus the execution of the task consists of alternating phases of computation and I/O until the task is finished. We assume that during an I/O phase the task cannot start a new computational phase (no CPU-I/O overlap).
Assume that T is the random variable that represents the running time of a task if it is alone in the system. The mean execution time $E(T)$ for task $t_i$ can be divided into three parts, not including the communication cost:

$$E(T) = T_1 + T_2 + T_3,$$

where
$T_1$ is the expected time needed to execute non-I/O instructions locally (local CPU time),
$T_2$ is the expected time needed to execute I/O instructions locally (local disk time),
$T_3$ is the expected time needed to execute I/O instructions remotely (remote disk time).

We use the following parameters to represent the above components:
$X = T_1 + T_2$ is the local time.
$C$ is the fraction of local time that the task spends at the local CPU.
$T_1 = C X$.
$T_2 = (1 - C) X$.
$T_3 = Y$.

So we can write $E(T)$ as:

$$E(T) = C X + (1 - C) X + Y.$$

All of the above parameters assume no contention. The performance model uses these parameters to calculate the effect of contention when more than one task is running in the cluster. For simplicity we normalize the expected execution time of a task. Therefore $E(T) = 1$, and hence $Y = 1 - X$.

3.2 System Model

In general, a cluster can be considered a distributed computing platform, either homogeneous or heterogeneous. It can be dedicated, where the parallel program has full control over the whole cluster, or undedicated, where the parallel program uses the full computational power of the nodes only when they are not used by a local owner. When a node is busy, there should be an agreement between the owner of the node and the cluster about the percentage of computational power that the node will provide to the cluster. The network model assumes that the transmission time follows a probability distribution (exponential). Two different architectures are considered:

1. Centralized Storage. In this model there is a central storage node, and all nodes contact this central node when they request global data.
2. Distributed Storage. Here, the required global data is distributed among all of the nodes.
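Returning to the application model, the normalized parameters can be sketched in a few lines of Python. This is only an illustration: the names `TaskProfile` and `normalized_profile` are hypothetical, and `x` stands for the local time T1 + T2 (the symbol name is an assumption made here for readability).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """No-contention time split for one task (illustrative names)."""
    cpu: float          # T1 = C * X, local CPU time
    local_disk: float   # T2 = (1 - C) * X, local disk time
    remote_disk: float  # T3 = Y, remote disk time

    @property
    def total(self) -> float:
        # E(T) = T1 + T2 + T3
        return self.cpu + self.local_disk + self.remote_disk


def normalized_profile(c: float, x: float) -> TaskProfile:
    """Split a task with E(T) = 1 into T1, T2, T3 given C and local time X."""
    if not (0.0 <= c <= 1.0 and 0.0 <= x <= 1.0):
        raise ValueError("C and X must both lie in [0, 1]")
    y = 1.0 - x  # normalization: E(T) = X + Y = 1
    return TaskProfile(cpu=c * x, local_disk=(1.0 - c) * x, remote_disk=y)
```

For instance, `normalized_profile(0.9, 0.75)` describes a task that spends three quarters of its time locally, 90% of that at the CPU, and the remaining quarter on remote data.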
3.3 Modeling The Cluster

We believe that a system with such a configuration should always be analyzed first by a network queueing model (Jackson network model). This is a very good way of identifying and organizing parameters and locating bottlenecks, even if the exponential assumptions and the independence of tasks are not satisfied. If applied correctly, Jackson network models are known to be very reliable under very light load or very heavy load.

General exponential queueing network models were first solved by Jackson [11] and by Gordon and Newell [9]. Buzen [5,6] developed the idea of using the G function as an efficient tool for analyzing performance. Muntz, Chandy and Baskett [18] developed many of the basic notions concerning several job streams, as well as some notions concerning non-exponential holding times. Moore [17] introduced the use of generating functions for the treatment of network queueing models. Our analysis of the performance of the computing cluster uses the generating functions for multiple-class queueing networks. Each of the K active tasks resides in its own CPU, so we put each task in its own class (e.g., task 1 does not use CPU 2, 3 or 4, but uses the other disks as remote).

3.3.1 Example for a Central Cluster

In this model, all of the tasks go to the central server asking for data, and each task takes Y units of time to get its data if there is no contention. Each task spends $C X$ units of time on average at its local CPU and $(1 - C) X$ units of time on average at its local disk, X being the task's local time. Each node is charged $B Y$ units of time to use the shared communication link. The task that resides on the central server does not go to the shared communication link when it needs to access its disk.

$$
E = \begin{pmatrix}
C X & Y + (1-C) X & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & Y & C X & (1-C) X & 0 & 0 & 0 & 0 & B Y \\
0 & Y & 0 & 0 & C X & (1-C) X & 0 & 0 & B Y \\
0 & Y & 0 & 0 & 0 & 0 & C X & (1-C) X & B Y
\end{pmatrix}
$$

The above matrix represents a central cluster with four nodes and four classes. $E_{ij}$ is the mean time task i needs on server j when there is no contention (that is, when the task is alone in the cluster).
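The four-node matrix generalizes naturally to an M-node central cluster. The sketch below builds it under the column layout of the example (central CPU, central disk, then a CPU and disk per remaining node, then the shared link). The function name, the zero-based indexing with node 0 as the central server, and the argument `x` for the local time are illustrative assumptions, not the authors' code.

```python
from __future__ import annotations


def central_demand_matrix(m: int, c: float, x: float, b: float) -> list[list[float]]:
    """Return E for an m-node central cluster with normalized tasks (Y = 1 - X).

    Row i is task i; node 0 is the central server. Columns are
    [cpu_0, disk_0, cpu_1, disk_1, ..., cpu_{m-1}, disk_{m-1}, link].
    """
    if m < 1:
        raise ValueError("the cluster needs at least one node")
    y = 1.0 - x          # remote disk time of a normalized task
    link = 2 * m         # index of the shared communication link column
    rows = []
    for i in range(m):
        row = [0.0] * (2 * m + 1)
        row[2 * i] = c * x                 # local CPU time
        row[2 * i + 1] += (1.0 - c) * x    # local disk time
        row[1] += y                        # global data always comes from the central disk
        if i != 0:
            row[link] = b * y              # only non-central tasks use the link
        rows.append(row)
    return rows
```

Calling `central_demand_matrix(4, c, x, b)` yields a 4 x 9 matrix with the same nonzero pattern as the example: the first row folds the local and global disk time into the central disk column and carries no link charge.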
In the matrix E, each row represents a task (a class). The columns represent the devices: the first column is the CPU of the central server and the second column is its disk; each following pair of columns represents a node, the first for its CPU and the second for its disk. The last column is the communication cost for each task. To model the cost of scheduler overhead, we would use another parameter and charge it to one or more CPUs.

3.4 Basic Assumptions

1. Exponential Distribution: We assume that the service time of each server (CPU or disk) and the rate of requesting data (I/O) are exponentially distributed. The assumption of exponential service time is a venerable assumption in all branches of queueing theory. Since the queue length at any node does not become large, the exponential distribution gives a very good approximation even if the actual distribution is not exponential.
2. Time Sharing: We assume that the processor sharing discipline is applied at each server. Since we assume exponential service times, our analysis is also valid if the queue discipline is FIFO.
3. Memory Limitation: We assume that each node has enough memory for its own local work. If the node does not have enough memory, an overhead for context switching would be added. This can also be modeled by introducing another parameter to be charged to the CPU.
4. Communication: We assume non-blocking send and blocking receive (a task has to wait for the requested data before it can resume work).

3.5 The Algorithm

Assume we have M nodes and K classes. The generating function G for this model can then be calculated from the following formula [6] (see e.g. [13] for algorithmic details):

$$G_M(N_1, N_2, \ldots, N_K) = G_{M-1}(N_1, N_2, \ldots, N_K) + E_{1M}\, G_M(N_1 - 1, N_2, \ldots, N_K) + E_{2M}\, G_M(N_1, N_2 - 1, \ldots, N_K) + \cdots + E_{KM}\, G_M(N_1, N_2, \ldots, N_K - 1)$$

The performance metric used here is the throughput, which can be calculated as follows:

$$\lambda_i = \frac{G_M(N_1, \ldots, N_i - 1, \ldots, N_K)}{G_M(N_1, N_2, \ldots, N_K)}$$
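The recursion and the throughput ratio can be sketched as follows. This is a minimal illustration of the standard multi-class convolution for fixed-rate servers, not the authors' implementation: the function names are hypothetical, the recursion coefficients are taken to be the no-contention times `E[i][j]` of class i at server j, and `class_throughputs` assumes every class has at least one customer.

```python
from itertools import product


def normalizing_constants(E, populations):
    """Return a dict mapping each population vector n <= populations to G_M(n).

    E[i][j] is the no-contention time class i needs at server j.
    """
    K = len(populations)
    M = len(E[0])
    # product() yields vectors in lexicographic order, so every n - e_i
    # is visited before n itself.
    states = list(product(*(range(n + 1) for n in populations)))
    # G_0(0) = 1 and G_0(n) = 0 for any non-empty population.
    G = {s: (1.0 if sum(s) == 0 else 0.0) for s in states}
    for j in range(M):            # add one server at a time
        new = {}
        for s in states:
            total = G[s]          # G_{j-1}(n): server j holds no customer
            for i in range(K):
                if s[i] > 0 and E[i][j] > 0.0:
                    prev = s[:i] + (s[i] - 1,) + s[i + 1:]
                    total += E[i][j] * new[prev]   # E_ij * G_j(n - e_i)
            new[s] = total
        G = new
    return G


def class_throughputs(E, populations):
    """Throughput of class i as G_M(n - e_i) / G_M(n); needs populations[i] >= 1."""
    G = normalizing_constants(E, populations)
    full = tuple(populations)
    result = []
    for i in range(len(full)):
        minus_one = full[:i] + (full[i] - 1,) + full[i + 1:]
        result.append(G[minus_one] / G[full])
    return result
```

With one task per class, the table holds 2^K population vectors, so this direct form is practical only for modest cluster sizes. As a sanity check, two tasks that share a single server with unit service time each obtain half of its capacity, while tasks that touch disjoint servers keep their no-contention throughput.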
where $\lambda_i$ is the throughput of class i, $N_i$ is the number of customers in class i (in our case it is equal to one: one task per node), and $E_{ij}$ is the time spent by class i at server j. In our analysis we assumed that the tasks are queued up and the first K tasks are assigned to the cluster ($N_i = 1$). When a task is finished, it is immediately replaced by another task from the queue. If more than one task should share node i, change the value of $N_i$ from 1 to the number of tasks that share node i.

4. Analysis and Results

In this section we study the performance behavior of both the distributed and the central cluster under different configurations. We modeled both the central and the distributed clusters for three different sizes (5, 8 and ) workstations. In Figures 1.a and 1.b we ignored the communication cost (B = 0) to check how contention at the shared disks affects the performance of the cluster. In all the figures we present, we use the average throughput of the cluster, which is the sum of the throughputs of all nodes divided by the number of nodes in the cluster.

Figure 1.a shows clearly how contention at the central disk affects the average throughput of the central cluster (it would equal one if no contention occurred). For example, if we have an 8-node central cluster and spend 25% of the task time at the central disk, the average throughput of the cluster is decreased by 5%. Meanwhile, the distributed cluster scales very well if we can ignore the communication cost. In other words, the cluster size has only a small effect on the contention at the shared disks of the distributed cluster if the communication cost is ignored. We can use these graphs to estimate the throughput of the cluster or the expected service time of the tasks under any specific configuration.

In Figures 2.a and 2.b we added the communication cost, and we assume that B = . We see in Fig. 2.a how contention at the communication channel can hurt the distributed cluster: it is now no better than the central cluster. There is a serious degradation in the average throughput of the cluster. Figure 2.c shows the probability that the communication channel is busy versus the amount of local work. As we increase the remote load (decrease the local work), the probability that the channel is busy increases. For M = and , the channel saturates when the task spends 25% of its time remotely. In Figure 2.d, for the central cluster, the probability that the central disk is busy is always greater than the probability that the channel is busy. So contention at the central disk, not the communication network, is the bottleneck for the central cluster.

In Figures 3.a and 3.b we increased the local disk load while decreasing the CPU load, but both systems show almost the same behavior as before. The reason is that the bottleneck in each system is unchanged (communication for the distributed cluster and the central disk for the central cluster), so increasing the local disk load did not change the average throughput in either system. Obviously these calculations are highly dependent upon the values of C, X and B. In designing a real cluster, these parameters must be estimated with some accuracy for the calculations to be applicable.

5. Conclusion

The development and use of cluster based computing is increasingly becoming an effective approach for solving high performance computing problems. We believe that understanding the performance limitations of such an environment will help in using it efficiently. In this paper, we introduced an analytical performance model that can predict the behavior of the cluster under different circumstances. The model is flexible and can be adapted to many platforms. We modeled the degradation in performance that occurs due to contention in the communication channel and at the shared disks. We showed how this contention can affect the performance of the cluster.
With minor modifications, the model can also predict contention at the CPU or the local memory if needed.

6. REFERENCES

[1] D. Abramson, J. Giddy, Nimrod: Tool for Performing Simulation using Distributed Workstations, The 4th HPDC, Aug
[2] K. Aida, U. Nagashima, Overview of a Performance Evaluation System for Global Scheduling Algorithms, Proc. of the 8th IEEE Inter. Symposium on High Performance Distributed Computing, pp. 97-4, 1999.
[3] G. Balbo, G. Serazzi, Asymptotic Analysis of Multiclass Closed Queueing Networks: Multiple Bottlenecks, Performance Evaluation, Vol. 3, pp. 5-52, 1997.
[4] F. Berman, S. M. Figueira, A Slowdown Model for Applications on Time-shared Clusters of Workstations, IEEE Transactions on Parallel and Distributed Systems, vol. 2, pp , Jun 2001.
[5] J. P. Buzen, Queueing Network Models of Multiprogramming, Ph.D. Thesis, Div. of Engr. and Physics, Harvard University, 1971.
[6] J. Buzen, Computational Algorithms for Closed Queueing, Comm. ACM, Vol 6, No. 9, Sep 1973.
[7] P. Dinda, Online Prediction of the Running Time of Tasks, th IEEE International Symposium on High Performance Distributed Computing, pp , 2001.
[8] I. Foster, W. Smith, V. Taylor, Predicting Application Run Times Using Historical Information, Proc. of IPPS/SPDP'98 Workshop, pp , 1998.
[9] W. J. Gordon, G. Newell, Closed Queueing Systems with Exponential Servers, JORSA, Vol. 5, pp , 1967.
[10] M. A. Iverson, F. Ozguner, L. C. Potter, Statistical Prediction of Task Execution Times Through Analytic Benchmarking for Scheduling in a Heterogeneous Environment, IEEE Transactions on Computers, vol. 48, no. 2, pp , Dec
[11] J. Jackson, Jobshop-Like Queueing Systems, J. TIMS, Vol., pp. 3-42, 1963.
[12] Y. Li, J. Antonio, Estimating the Execution Time Distribution for a Task Graph in a Heterogeneous Computing System, Proc. of the 6th HCW 1997, pp , 1997.
[13] Lester Lipsky, J. D. Church, Applications of a Queueing Network Model for a Computer System, ACM Computing Surveys (CSUR), Vol. 9, Issue 3, Sep
[14] N. Benitez, A. McSpadden, Stochastic Petri Nets Applied to the Performance Evaluation of Static Task Allocations in Heterogeneous Computing Environments, Proceedings of the 6th Heterogeneous Computing Workshop, pp , 1997.
[15] M. Marsan, G. Balbo, G. Conte, A Class of Generalized Stochastic Petri Nets for the Performance Evaluation of Multiprocessor Systems, ACM Transactions on Computer Systems, Vol. 2, n. 2, pp. 93-22, May 1984.
[16] B. Mohr, A. Malony, Speedy: An Integrated Performance Extrapolation Tool For pC++ Programs, Proc. of Joint Conference Performance Tools, pp , 1995.
[17] F. Moore, Computational Model of a Closed Queueing Network with Exponential Servers, IBM J. of Res. and Develop., pp , Nov
[18] R. Muntz, F. Baskett, K. Chandy, Open, Closed and Mixed Networks of Queues with Different Classes of Customers, JACM, Vol. 22, pp , Apr 1975.
[19] K. Trivedi, Oliver C. Ibe, Choi, Performance Evaluation of Client-Server Systems, IEEE Transactions on Parallel and Distributed Systems, Vol. 4, pp , Nov
[20] K. S. Trivedi, A. Puliafito, M. Scarpa, Petri Nets with k-simultaneously Enabled Generally Distributed Timed Transitions, Performance Evaluation, Vol. 32, No., pp. -34, Feb. 1998.
[21] E. Varki, Response Time Analysis of Parallel Computer and Storage Systems, IEEE Transactions on Parallel and Distributed Systems, vol. 2, no., pp. 46-6, Nov. 2001.
[22] R. Wolski, N. T. Spring, J. Hayes, The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing, J. Future Generation Computer Systems, vol. 5, no. 5-6, pp , 1999.
[23] Y. Yan, . Zhang, Y. Song, An Effective Performance Prediction Model for Parallel Computing on Non-dedicated Heterogeneous Networks of Workstations, J. of Parallel and Distributed Computing, vol. 38, no., pp. 63-8, 1996.
Fig. 1.a: Distributed Cluster. Fig. 1.b: Central Cluster. (Curves for cluster sizes M5, M8 and a third size.)
Fig. 2.a: Distributed Cluster, C = .9. Fig. 2.b: Central Cluster, C = .9.
Fig. 2.c and Fig. 2.d: Busy probability P of the disk (D) and the channel (Ch), C = .9.
Fig. 3.a: Distributed Cluster, C = .5. Fig. 3.b: Central Cluster, C = .5.
Probabilistic Modeling of Leach Protocol and Computing Sensor Energy Consumption Rate in Sensor Networks Dezhen Song CS Department, Texas A&M University Technical Report: TR 2005-2-2 Email: dzsong@cs.tamu.edu
More informationA New Call Admission Control scheme for Real-time traffic in Wireless Networks
A New Call Admission Control scheme for Real-time traffic in Wireless Networks Maneesh Tewari and H.S. Jamadagni Center for Electronics Design and Technology, Indian Institute of Science, Bangalore, 5612
More informationA Capacity Planning Methodology for Distributed E-Commerce Applications
A Capacity Planning Methodology for Distributed E-Commerce Applications I. Introduction Most of today s e-commerce environments are based on distributed, multi-tiered, component-based architectures. The
More informationQuantitative System Evaluation with Java Modeling Tools (Tutorial Paper)
Quantitative System Evaluation with Java Modeling Tools (Tutorial Paper) ABSTRACT Giuliano Casale Imperial College London Dept. of Computing London, SW7 2AZ, U.K. g.casale@imperial.ac.uk Java Modelling
More informationQoS-constrained List Scheduling Heuristics for Parallel Applications on Grids
16th Euromicro Conference on Parallel, Distributed and Network-Based Processing QoS-constrained List Scheduling Heuristics for Parallel Applications on Grids Ranieri Baraglia, Renato Ferrini, Nicola Tonellotto
More informationEAI Endorsed Transactions on Industrial Networks And Intelligent Systems
EAI Endorsed Transactions on Industrial Networs And Intelligent Systems Research Article Coupling of the synchronization stations of an Extended Kanban system Leandros A. Maglaras, * University of Surrey,
More informationAN ANALYSIS OF TIME-SHARING COMPUTER SYSTEMS USING MARKOV MODELS*
AN ANALYSIS OF TIME-SHARING COMPUTER SYSTEMS USING MARKOV MODELS* J. L. Smith Systems Engineering Laboratory, The University of Michigan Ann Arbor, Michigan INTRODUCTION The development of RQA 1 (Recursive
More informationDiPerF: automated DIstributed PERformance testing Framework
DiPerF: automated DIstributed PERformance testing Framework Ioan Raicu, Catalin Dumitrescu, Matei Ripeanu, Ian Foster Distributed Systems Laboratory Computer Science Department University of Chicago Introduction
More informationIntroduction: Two motivating examples for the analytical approach
Introduction: Two motivating examples for the analytical approach Hongwei Zhang http://www.cs.wayne.edu/~hzhang Acknowledgement: this lecture is partially based on the slides of Dr. D. Manjunath Outline
More informationPRODUCTION SYSTEMS ENGINEERING:
PRODUCTION SYSTEMS ENGINEERING: Optimality through Improvability Semyon M. Meerkov EECS Department University of Michigan Ann Arbor, MI 48109-2122, USA YEQT-IV (Young European Queueing Theorists Workshop)
More informationChapter 6: CPU Scheduling. Operating System Concepts 9 th Edition
Chapter 6: CPU Scheduling Silberschatz, Galvin and Gagne 2013 Chapter 6: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Real-Time
More informationVerification and Validation of X-Sim: A Trace-Based Simulator
http://www.cse.wustl.edu/~jain/cse567-06/ftp/xsim/index.html 1 of 11 Verification and Validation of X-Sim: A Trace-Based Simulator Saurabh Gayen, sg3@wustl.edu Abstract X-Sim is a trace-based simulator
More informationC. H. Sauer K. M. Chandy
C. H. Sauer K. M. Chandy Approximate Analysis of Central Server Models Abstract: Service time distributions at computer processing units are often nonexponential. Empirical studies show that different
More informationEpochs: Trace-Driven Analytical Modeling of Job Execution Times
Department of Computer Science George Mason University Technical Reports 4400 University Drive MS#4A5 Fairfax, VA 22030-4444 USA http://cs.gmu.edu/ 703-993-1530 Epochs: Trace-Driven Analytical Modeling
More informationSurvey on MapReduce Scheduling Algorithms
Survey on MapReduce Scheduling Algorithms Liya Thomas, Mtech Student, Department of CSE, SCTCE,TVM Syama R, Assistant Professor Department of CSE, SCTCE,TVM ABSTRACT MapReduce is a programming model used
More informationParallel Systems. Part 7: Evaluation of Computers and Programs. foils by Yang-Suk Kee, X. Sun, T. Fahringer
Parallel Systems Part 7: Evaluation of Computers and Programs foils by Yang-Suk Kee, X. Sun, T. Fahringer How To Evaluate Computers and Programs? Learning objectives: Predict performance of parallel programs
More informationSystems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15
Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2014/15 Lecture X: Parallel Databases Topics Motivation and Goals Architectures Data placement Query processing Load balancing
More informationTiming-Based Communication Refinement for CFSMs
Timing-Based Communication Refinement for CFSMs Heloise Hse and Irene Po {hwawen, ipo}@eecs.berkeley.edu EE249 Term Project Report December 10, 1998 Department of Electrical Engineering and Computer Sciences
More informationComparison of pre-backoff and post-backoff procedures for IEEE distributed coordination function
Comparison of pre-backoff and post-backoff procedures for IEEE 802.11 distributed coordination function Ping Zhong, Xuemin Hong, Xiaofang Wu, Jianghong Shi a), and Huihuang Chen School of Information Science
More informationPerformance Analysis of WLANs Under Sporadic Traffic
Performance Analysis of 802.11 WLANs Under Sporadic Traffic M. Garetto and C.-F. Chiasserini Dipartimento di Elettronica, Politecnico di Torino, Italy Abstract. We analyze the performance of 802.11 WLANs
More informationA DEVS LIBRARY FOR LAYERED QUEUING NETWORKS
A DEVS LIBRARY FOR LAYERED QUEUING NETWORKS Dorin B. Petriu and Gabriel Wainer Department of Systems and Computer Engineering Carleton University, 1125 Colonel By Drive Ottawa, Ontario K1S 5B6, Canada.
More informationStudy of Load Balancing Schemes over a Video on Demand System
Study of Load Balancing Schemes over a Video on Demand System Priyank Singhal Ashish Chhabria Nupur Bansal Nataasha Raul Research Scholar, Computer Department Abstract: Load balancing algorithms on Video
More informationOperating Systems Unit 3
Unit 3 CPU Scheduling Algorithms Structure 3.1 Introduction Objectives 3.2 Basic Concepts of Scheduling. CPU-I/O Burst Cycle. CPU Scheduler. Preemptive/non preemptive scheduling. Dispatcher Scheduling
More informationMemory Allocation. Copyright : University of Illinois CS 241 Staff 1
Memory Allocation Copyright : University of Illinois CS 241 Staff 1 Allocation of Page Frames Scenario Several physical pages allocated to processes A, B, and C. Process B page faults. Which page should
More informationA Quality of Service Decision Model for ATM-LAN/MAN Interconnection
A Quality of Service Decision for ATM-LAN/MAN Interconnection N. Davies, P. Francis-Cobley Department of Computer Science, University of Bristol Introduction With ATM networks now coming of age, there
More informationSmall verse Large. The Performance Tester Paradox. Copyright 1202Performance
Small verse Large The Performance Tester Paradox The Paradox Why do people want performance testing? To stop performance problems in production How do we ensure this? Performance test with Realistic workload
More informationINTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING AND TECHNOLOGY (IJARET)
INTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING AND TECHNOLOGY (IJARET) ISSN 0976-6480 (Print) ISSN 0976-6499 (Online) Volume 4, Issue 1, January- February (2013), pp. 50-58 IAEME: www.iaeme.com/ijaret.asp
More informationMean Value Analysis and Related Techniques
Mean Value Analysis and Related Techniques 34-1 Overview 1. Analysis of Open Queueing Networks 2. Mean-Value Analysis 3. Approximate MVA 4. Balanced Job Bounds 34-2 Analysis of Open Queueing Networks Used
More informationScheduling Real Time Parallel Structure on Cluster Computing with Possible Processor failures
Scheduling Real Time Parallel Structure on Cluster Computing with Possible Processor failures Alaa Amin and Reda Ammar Computer Science and Eng. Dept. University of Connecticut Ayman El Dessouly Electronics
More informationLecture 9: MIMD Architecture
Lecture 9: MIMD Architecture Introduction and classification Symmetric multiprocessors NUMA architecture Cluster machines Zebo Peng, IDA, LiTH 1 Introduction MIMD: a set of general purpose processors is
More informationADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT
ADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT PhD Summary DOCTORATE OF PHILOSOPHY IN COMPUTER SCIENCE & ENGINEERING By Sandip Kumar Goyal (09-PhD-052) Under the Supervision
More informationECE519 Advanced Operating Systems
IT 540 Operating Systems ECE519 Advanced Operating Systems Prof. Dr. Hasan Hüseyin BALIK (10 th Week) (Advanced) Operating Systems 10. Multiprocessor, Multicore and Real-Time Scheduling 10. Outline Multiprocessor
More informationModular Petri Net Processor for Embedded Systems
Modular Petri Net Processor for Embedded Systems Orlando Micolini 1, Emiliano N. Daniele, Luis O. Ventre Laboratorio de Arquitectura de Computadoras (LAC) FCEFyN Universidad Nacional de Córdoba orlando.micolini@unc.edu.ar,
More informationJournal of Electronics and Communication Engineering & Technology (JECET)
Journal of Electronics and Communication Engineering & Technology (JECET) JECET I A E M E Journal of Electronics and Communication Engineering & Technology (JECET)ISSN ISSN 2347-4181 (Print) ISSN 2347-419X
More informationCOURSE 12. Parallel DBMS
COURSE 12 Parallel DBMS 1 Parallel DBMS Most DB research focused on specialized hardware CCD Memory: Non-volatile memory like, but slower than flash memory Bubble Memory: Non-volatile memory like, but
More informationCHAPTER 4 AN INTEGRATED APPROACH OF PERFORMANCE PREDICTION ON NETWORKS OF WORKSTATIONS. Xiaodong Zhang and Yongsheng Song
CHAPTER 4 AN INTEGRATED APPROACH OF PERFORMANCE PREDICTION ON NETWORKS OF WORKSTATIONS Xiaodong Zhang and Yongsheng Song 1. INTRODUCTION Networks of Workstations (NOW) have become important distributed
More informationShaping Process Semantics
Shaping Process Semantics [Extended Abstract] Christoph M. Kirsch Harald Röck Department of Computer Sciences University of Salzburg, Austria {ck,hroeck}@cs.uni-salzburg.at Analysis. Composition of virtually
More informationPerformance measurement. SMD149 - Operating Systems - Performance and processor design. Introduction. Important trends affecting performance issues
Performance measurement SMD149 - Operating Systems - Performance and processor design Roland Parviainen November 28, 2005 Performance measurement Motivation Techniques Common metrics Processor architectural
More informationLast Class: Processes
Last Class: Processes A process is the unit of execution. Processes are represented as Process Control Blocks in the OS PCBs contain process state, scheduling and memory management information, etc A process
More informationModeling VMware ESX Server Performance A Technical White Paper. William L. Shelden, Jr., Ph.D Sr. Systems Analyst
Modeling VMware ESX Server Performance A Technical White Paper William L. Shelden, Jr., Ph.D Sr. Systems Analyst Modeling VMware ESX Server Performance William L. Shelden, Jr., Ph.D. The Information Systems
More informationOVERHEADS ENHANCEMENT IN MUTIPLE PROCESSING SYSTEMS BY ANURAG REDDY GANKAT KARTHIK REDDY AKKATI
CMPE 655- MULTIPLE PROCESSOR SYSTEMS OVERHEADS ENHANCEMENT IN MUTIPLE PROCESSING SYSTEMS BY ANURAG REDDY GANKAT KARTHIK REDDY AKKATI What is MULTI PROCESSING?? Multiprocessing is the coordinated processing
More informationUnit 9 : Fundamentals of Parallel Processing
Unit 9 : Fundamentals of Parallel Processing Lesson 1 : Types of Parallel Processing 1.1. Learning Objectives On completion of this lesson you will be able to : classify different types of parallel processing
More informationINTEGRATION of data communications services into wireless
208 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL 54, NO 2, FEBRUARY 2006 Service Differentiation in Multirate Wireless Networks With Weighted Round-Robin Scheduling and ARQ-Based Error Control Long B Le, Student
More informationIntegration of analytic model and simulation model for analysis on system survivability
6 Integration of analytic model and simulation model for analysis on system survivability Jang Se Lee Department of Computer Engineering, Korea Maritime and Ocean University, Busan, Korea Summary The objective
More informationImpact of End-to-end QoS Connectivity on the Performance of Remote Wireless Local Networks
Impact of End-to-end QoS Connectivity on the Performance of Remote Wireless Local Networks Veselin Rakocevic School of Engineering and Mathematical Sciences City University London EC1V HB, UK V.Rakocevic@city.ac.uk
More informationA Heuristic Approach to the Design of Kanban Systems
A Heuristic Approach to the Design of Kanban Systems Chuda Basnet Department of Management Systems University of Waikato, Hamilton Abstract Kanbans are often used to communicate replenishment requirements
More informationFuture-ready IT Systems with Performance Prediction using Analytical Models
Future-ready IT Systems with Performance Prediction using Analytical Models Madhu Tanikella Infosys Abstract Large and complex distributed software systems can impact overall software cost and risk for
More informationCPSC 531: System Modeling and Simulation. Carey Williamson Department of Computer Science University of Calgary Fall 2017
CPSC 531: System Modeling and Simulation Carey Williamson Department of Computer Science University of Calgary Fall 2017 Recap: Simulation Model Taxonomy 2 Recap: DES Model Development How to develop a
More informationHeuristic Algorithms for Multiconstrained Quality-of-Service Routing
244 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL 10, NO 2, APRIL 2002 Heuristic Algorithms for Multiconstrained Quality-of-Service Routing Xin Yuan, Member, IEEE Abstract Multiconstrained quality-of-service
More informationHPC Considerations for Scalable Multidiscipline CAE Applications on Conventional Linux Platforms. Author: Correspondence: ABSTRACT:
HPC Considerations for Scalable Multidiscipline CAE Applications on Conventional Linux Platforms Author: Stan Posey Panasas, Inc. Correspondence: Stan Posey Panasas, Inc. Phone +510 608 4383 Email sposey@panasas.com
More informationOutline. Application examples
Outline Application examples Google page rank algorithm Aloha protocol Virtual circuit with window flow control Store-and-Forward packet-switched network Interactive system with infinite servers 1 Example1:
More informationB.H.GARDI COLLEGE OF ENGINEERING & TECHNOLOGY (MCA Dept.) Parallel Database Database Management System - 2
Introduction :- Today single CPU based architecture is not capable enough for the modern database that are required to handle more demanding and complex requirements of the users, for example, high performance,
More informationPerformance Impact of I/O on Sender-Initiated and Receiver-Initiated Load Sharing Policies in Distributed Systems
Appears in Proc. Conf. Parallel and Distributed Computing Systems, Dijon, France, 199. Performance Impact of I/O on Sender-Initiated and Receiver-Initiated Load Sharing Policies in Distributed Systems
More informationEffective Load Sharing on Heterogeneous Networks of Workstations
Effective Load Sharing on Heterogeneous Networks of Workstations Li Xiao Xiaodong Zhang Yanxia Qu Department of Computer Science College of William and Mary Williamsburg, VA 387-8795 flxiao, zhangg@cs.wm.edu
More informationEP2200 Queueing theory and teletraffic systems
EP2200 Queueing theory and teletraffic systems Viktoria Fodor Laboratory of Communication Networks School of Electrical Engineering Lecture 1 If you want to model networks Or a complex data flow A queue's
More informationA Quantitative Model for Capacity Estimation of Products
A Quantitative Model for Capacity Estimation of Products RAJESHWARI G., RENUKA S.R. Software Engineering and Technology Laboratories Infosys Technologies Limited Bangalore 560 100 INDIA Abstract: - Sizing
More informationA Decoupled Scheduling Approach for the GrADS Program Development Environment. DCSL Ahmed Amin
A Decoupled Scheduling Approach for the GrADS Program Development Environment DCSL Ahmed Amin Outline Introduction Related Work Scheduling Architecture Scheduling Algorithm Testbench Results Conclusions
More informationComputational performance and scalability of large distributed enterprise-wide systems supporting engineering, manufacturing and business applications
Computational performance and scalability of large distributed enterprise-wide systems supporting engineering, manufacturing and business applications Janusz S. Kowalik Mathematics and Computing Technology
More informationLink Lifetime Prediction in Mobile Ad-Hoc Network Using Curve Fitting Method
IJCSNS International Journal of Computer Science and Network Security, VOL.17 No.5, May 2017 265 Link Lifetime Prediction in Mobile Ad-Hoc Network Using Curve Fitting Method Mohammad Pashaei, Hossein Ghiasy
More informationUniprocessor Scheduling. Basic Concepts Scheduling Criteria Scheduling Algorithms. Three level scheduling
Uniprocessor Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Three level scheduling 2 1 Types of Scheduling 3 Long- and Medium-Term Schedulers Long-term scheduler Determines which programs
More informationAnalysis of Air Transportation Network Delays Using Stochastic Modeling
Arun Shankar Analysis of Air Transportation Network Delays Using Stochastic Modeling Abstract We model the air traffic of 1 airports (each w/1 gate) with a closed Jackson queuing network using various
More informationUser Based Call Admission Control Policies for Cellular Mobile Systems: A Survey
User Based Call Admission Control Policies for Cellular Mobile Systems: A Survey Hamid Beigy and M. R. Meybodi Computer Engineering Department Amirkabir University of Technology Tehran, Iran {beigy, meybodi}@ce.aut.ac.ir
More informationSimulation of Task Graph Systems in Heterogeneous Computing Environments
Simulation of Task Graph Systems in Heterogeneous Computing Environments Noe Lopez-Benitez and Ja-Young Hyon Department of Computer Science College of Engineering Te xas Tech University Lubbock, Texas
More informationOn the Relationship of Server Disk Workloads and Client File Requests
On the Relationship of Server Workloads and Client File Requests John R. Heath Department of Computer Science University of Southern Maine Portland, Maine 43 Stephen A.R. Houser University Computing Technologies
More informationHomework # 2 Due: October 6. Programming Multiprocessors: Parallelism, Communication, and Synchronization
ECE669: Parallel Computer Architecture Fall 2 Handout #2 Homework # 2 Due: October 6 Programming Multiprocessors: Parallelism, Communication, and Synchronization 1 Introduction When developing multiprocessor
More information