TIME WARP PARALLEL LOGIC SIMULATION ON A DISTRIBUTED MEMORY MULTIPROCESSOR. Peter Luksch, Holger Weitlich


Peter Luksch, Holger Weitlich
Department of Computer Science, Munich University of Technology
P.O. Box, D-W-8-München, Germany
luksch@informatik.tu-muenchen.de
To appear in SCS European Simulation Conference, Lyon, June 7--9, 99

ABSTRACT

In this paper we describe a Time Warp based parallel implementation of an event driven logic simulator on a distributed memory multiprocessor (iPSC/86). The basic Time Warp mechanism has been complemented with an optimized method for incremental state saving and with a mechanism that optimizes re-simulation of a rolled back period of simulated time, which is especially worthwhile for complex elements. In addition to static partitioning, where elements are distributed to partitions either randomly or by a min-cut algorithm, our implementation supports dynamic repartitioning. For our measurements, we used a set of well-known benchmark circuits. Speedups proved to be strongly dependent on the circuit being simulated, its input stimuli and the way the circuit is partitioned. However, one observation has been made with most of the workloads: the simulators' local virtual times (LVTs) tend to diverge widely throughout the simulation. Even though memory requirements for state saving have been minimized, simulators whose LVT is far ahead of the other processes run out of memory for larger circuits. We therefore had to limit Time Warp's optimism by preventing simulators from getting too far ahead of the global virtual time (GVT).

THE TIME WARP MECHANISM

The Time Warp mechanism [Jefferson, 98] is an optimistic synchronisation protocol which can be used to synchronize parallel discrete event simulation that is based on model partitioning. The protocol, however, is not restricted to this application. Each process simulates a partition of the circuit's elements and has its own simulation time (LVT) and event list.
Whenever it generates an event affecting a signal that connects to remote partitions, it sends an event message to the corresponding processes. Processes simulate their partition based on their current information about signal values, which, however, may be incorrect because events with time stamps in the local past of the receiving simulator may arrive from remote partitions. Such an event message is referred to as a straggler. Stragglers as well as anti-messages (i.e. messages informing a process that an earlier event message was incorrect) cause the simulator to roll back, i.e. to return to the point at which the simulation began to be incorrect. Rollback involves restoring the local state for the point in simulated time to which the simulator rolls back and cancelling all messages sent in the rolled back period. Therefore, the simulation process has to store information about its local state and the messages it has sent.

One method to undo messages is to send anti-messages immediately upon rollback (aggressive cancellation). The alternative approach, lazy cancellation, is based on the observation that a large portion of events can be expected to be generated again during re-simulation of the rolled back time interval. Therefore, an anti-message is sent only if the corresponding event is guaranteed not to be generated again. In our implementation, we use lazy cancellation.

Global progress of a Time Warp simulation is measured by the global virtual time (GVT), which is the minimum of the local simulation times and the time stamps of all unprocessed events in the system. A number of GVT algorithms have been proposed in the literature [Samadi, 98, Lin & Lazowska, 99, Bauer & Sporrer, 99].

This work has been partially funded by the DFG ("Deutsche Forschungsgemeinschaft", German science foundation) under contract No. SFB, TP A.
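
The difference between the two cancellation strategies can be illustrated with a small sketch. It compares the messages sent during a rolled back interval with those produced again by re-simulation. The function names and the message layout `(time stamp, signal, value)` are hypothetical and chosen only for illustration. The comparison is also done once for the whole interval, which is a simplification: a real lazy scheme decides per message as re-simulation proceeds.

```python
from collections import Counter

def aggressive_cancellation(sent_before_rollback):
    """Aggressive: every message of the rolled back interval is cancelled at once."""
    return list(sent_before_rollback)

def lazy_cancellation(sent_before_rollback, regenerated):
    """Lazy: cancel only messages that re-simulation did not produce again.

    Returns (anti_messages, new_messages):
      anti_messages - previously sent messages that were not regenerated
      new_messages  - regenerated messages that had not been sent before
    """
    old = Counter(sent_before_rollback)
    new = Counter(regenerated)
    anti_messages = sorted((old - new).elements())
    new_messages = sorted((new - old).elements())
    return anti_messages, new_messages

# Messages are (time stamp, signal, value) tuples; the values are made up.
sent = [(10, "s1", 1), (12, "s2", 0), (15, "s1", 0)]
resim = [(10, "s1", 1), (13, "s2", 0), (15, "s1", 0)]
anti, fresh = lazy_cancellation(sent, resim)
```

In this made-up example two of the three messages are regenerated unchanged, so lazy cancellation sends one anti-message and one new event message, whereas aggressive cancellation would cancel all three and send three again.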

TIME WARP PARALLELIZATION OF A LOGIC SIMULATOR

The basis for our parallel implementation is a gate-level logic simulator that implements most of today's state-of-the-art techniques for modelling digital circuits [Krodel & Antreich, 99]. It uses a six-valued logic and allows ambiguity delays to be modelled explicitly. The program is written in C for a UNIX environment. The parallel program has been implemented on iPSC/86 and iPSC/ multiprocessors using MMK, a parallel programming library designed within project SFB, TP A.

Communication. Figure displays the performance of the communication system as a function of message length. For each message there is a significant startup time of s on the iPSC/86 and ms on the iPSC/ which is independent of message length. This latency is due to the circuit-switched message passing of the iPSCs. Therefore, a given amount of information should be transferred using few long messages instead of many short ones, i.e. several event messages have to be combined into one message that is transmitted by the communication system. On the other hand, the synchronisation protocol requires remote partitions to be informed about events as soon as possible, so that simulators do not have to roll back over long periods of time because they were informed too late that their computation was incorrect.

In our implementation, communication is controlled by a buffering mechanism that accounts for both of these conflicting requirements. There is one buffer for each remote partition, to which event messages for that partition are written. After each step (i.e. one iteration of the loop of signal value update followed by the evaluation of fanout elements), each buffer is checked to see whether it has reached some minimum length or contains events that were generated more than a maximum number of steps earlier. The number of steps the simulator executes while an event stays in the buffer is referred to as the event's age. If a buffer is long enough or contains events that have been held for too long, its contents are sent. Synchronisation efficiency can be optimized for a given multiprocessor system by adjusting these two parameters.

State Saving.
The state of a simulator can be saved either periodically as a whole (checkpointing) or incrementally by storing state changes. Since in logic simulation each event changes only a very small portion of the state, checkpointing would result in inefficient memory usage. Moreover, the target system, like most of today's parallel computers, has only limited physical and no virtual memory on the nodes. As state information is quite large when simulating big circuits, incremental state saving has to be used. Memory requirements are reduced further by saving only the first change of a signal value that occurs while processing an active point in simulated time. Since a rolled back point in simulated time is always re-simulated completely, this state information is sufficient.

[Figure : MMK remote send operation: communication performance (communication rate [MB/sec]) as a function of message length [kb] on the iPSC/86 and iPSC/]

Global Virtual Time. We have implemented two GVT algorithms: Samadi's algorithm [Samadi, 98] and an algorithm proposed by Lin and Lazowska [Lin & Lazowska, 99]. In contrast to the Lin/Lazowska algorithm, Samadi's simple GVT algorithm requires all processes to stop simulation during GVT computation. Our implementation of inter-simulator communication permits processes to continue local simulation; however, they must refrain from sending any event messages during GVT computation. Simulation with Samadi's algorithm proved to be faster than with Lin/Lazowska's method because the latter requires more messages to be sent.

Optimized Re-Simulation after Rollback. Lazy cancellation is based on the optimistic assumption that most events will be generated again when re-simulating the rolled back interval. While lazy cancellation prevents unnecessary re-evaluations in remote partitions, local computation is redone completely. For complex elements like PLAs or even microprocessors it is desirable to avoid unnecessary evaluations in the local partition, too.
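
A minimal sketch of this idea follows: during re-simulation, an element evaluation is skipped when the element's inputs and internal state match those recorded for the same triggering event in the rolled back run, and the events produced back then are re-scheduled instead. The names `EvalRecord`, `ResimCache` and `evaluate_and` are hypothetical, and the cache layout is an assumption made for illustration, not the data structure of the simulator described here.

```python
from dataclasses import dataclass

@dataclass
class EvalRecord:
    inputs: tuple        # element input values just before the evaluation
    state: object        # element internal state (None if stateless)
    events: list         # events the evaluation generated

class ResimCache:
    """Remembers, per (triggering event, element), what an evaluation produced."""

    def __init__(self):
        self.records = {}
        self.evaluations = 0   # counts real (non-skipped) evaluations

    def evaluate(self, trigger, element, inputs, state, evaluate_fn):
        key = (trigger, element)
        rec = self.records.get(key)
        if rec is not None and rec.inputs == inputs and rec.state == state:
            # Same cause, same inputs, same state: re-schedule the old events.
            return rec.events
        self.evaluations += 1
        events = evaluate_fn(inputs, state)
        self.records[key] = EvalRecord(inputs, state, events)
        return events

def evaluate_and(inputs, state):
    """Stand-in for an expensive element model: a 2-input AND with delay 2."""
    return [("out", 2, int(all(inputs)))]

cache = ResimCache()
trigger = ("a", 10, 1)                    # (signal, time stamp, value)
first = cache.evaluate(trigger, "g1", (1, 1), None, evaluate_and)
# ... rollback, then the same event is processed again with unchanged inputs
again = cache.evaluate(trigger, "g1", (1, 1), None, evaluate_and)
# after a straggler changed an input, the element really has to be evaluated
changed = cache.evaluate(trigger, "g1", (0, 1), None, evaluate_and)
```

The saving grows with the cost of `evaluate_fn`; for a simple gate the bookkeeping may well cost more than the evaluation it avoids, which is consistent with the technique being aimed at complex elements.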
In order to skip renewed element evaluation, the simulator must know the events that were generated when the element under consideration was evaluated in the preceding simulation, i.e. the causality relation between events needs to be stored during "normal" simulation. For each event that is executed, pointers to the events generated when evaluating the fanout elements of the affected signal are stored, together with the identity of the fanout element whose evaluation created them, the element's input signal values and its internal state (if any). During rollback the simulator marks local events instead of deleting them as is done in the basic Time Warp mechanism. If, during re-simulation of a rolled back period in simulated time, an element is due to be evaluated because of an event that also occurred in the previous simulation, the simulator checks whether the element's current inputs and its internal state are the same as just before the corresponding evaluation in the previous simulation. If so, the element need not be evaluated. Instead, the events caused by its previous evaluation can be re-scheduled.

Partitioning. Before simulation, circuits are partitioned based on their topology. We use random partitioning and a min-cut algorithm which is a generalization of Fiduccia and Mattheyses' bipartitioning method [Vijayan, 989]. At runtime, dynamic repartitioning takes the activity of elements and signals into account in order to distribute work evenly among the processors. Each simulator reports to the GVT process the time of the earliest unprocessed event in its partition that it knows about. These time stamps reflect the simulator's load. A simulator reporting a low time stamp lags far behind the others in its simulation, i.e. it is heavily loaded. A lightly loaded simulator will advance its LVT quickly and thus report a high time stamp. In principle, elements should be moved from the "slowest" simulator to the "fastest" simulator. Elements are selected according to their complexity and their activity. In addition, the effect of possible element migrations on communication topology must be taken into account.

Time Warp synchronisation introduces an additional problem: in order to be able to roll back simulation, a simulator whose partition has been assigned new elements has to know the state information associated with the signals connected to these elements. If a signal was not in the partition before repartitioning, its "history" must be transferred, too.

In our implementation the GVT process tells the "slowest" simulator to move elements out of its partition.
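
To make the load measure concrete, the sketch below picks the most heavily and most lightly loaded simulators from the reported time stamps and scores each candidate migration by its communication cost, taken here as the number of inter-partition signals weighted by their activities. The function names, the dictionary layout and the sample circuit are hypothetical. The sketch also ignores element complexity and the transfer of signal histories, so it should be read as an illustration of the selection criteria rather than as the implemented procedure.

```python
def slowest_simulator(reported_times):
    """The simulator reporting the lowest time stamp is the most heavily loaded."""
    return min(reported_times, key=reported_times.get)

def fastest_simulator(reported_times):
    """The simulator reporting the highest time stamp is the most lightly loaded."""
    return max(reported_times, key=reported_times.get)

def communication_cost(assignment, signals, activity):
    """Sum of activities of all signals whose elements span several partitions."""
    cost = 0
    for signal, elements in signals.items():
        partitions = {assignment[e] for e in elements}
        if len(partitions) > 1:
            cost += activity[signal]
    return cost

def best_migration(assignment, signals, activity, source, target):
    """Try moving each element of `source` to `target`; return the cheapest move."""
    best = None
    for element, part in assignment.items():
        if part != source:
            continue
        trial = {**assignment, element: target}
        cost = communication_cost(trial, signals, activity)
        if best is None or cost < best[1]:
            best = (element, cost)
    return best

reported = {"P0": 120, "P1": 480, "P2": 300}        # earliest unprocessed event times
assignment = {"g1": "P0", "g2": "P0", "g3": "P1"}    # element -> partition
signals = {"s1": ["g1", "g2"], "s2": ["g2", "g3"]}   # signal -> connected elements
activity = {"s1": 5, "s2": 40}                       # events counted per signal

source = slowest_simulator(reported)
target = fastest_simulator(reported)
move = best_migration(assignment, signals, activity, source, target)
```

In this made-up circuit, moving `g2` turns the highly active signal `s2` into a local signal of the target partition, so it is preferred over moving `g1`, which would cut both signals.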
The "slowest" simulator determines the target partition according to the other simulators' load values (provided by the GVT process) and the number of events that it has sent to each of the other simulators in the past. It selects elements whose outputs are already in the receiver's partition whenever this is possible. For each element that is a candidate for migration, the effect that moving it would have on communication topology is considered. Migrations resulting in minimal communication costs (i.e. number of inter-partition signals weighted by their activities) are preferred. Also, moving a few highly active elements is preferred to moving many less active ones. Element and signal activities are measured by counting the number of evaluations of each element and the number of events for each signal.

EXPERIMENTAL RESULTS

The parallel simulator has been run with several of the ISCAS-89 benchmark circuits. Performance measurements were done by source code instrumentation. Times for different subtasks were measured using the iPSC's hardware clock. Additional statistics were collected by counters. The dynamic behaviour of LVTs on different nodes was observed using the TOPSYS software monitor [Bemmerl et al., 99].

In most of our simulation runs the simulators' LVTs have diverged extremely. During simulation of, units of time LVTs diverge by up to more than, time units, i.e. nearly half the total period being simulated. Even though memory consumption has been minimized by incremental state saving, simulators run out of memory for larger circuits or longer input sequences. We therefore had to limit Time Warp's optimism: simulators are prevented from advancing their LVTs too far ahead of GVT by suspending simulation whenever a maximum value for memory consumption is exceeded.

Speedup. Speedup does not scale linearly with the number of simulators. Instead, the curves show peaks and valleys (see figure ). Although it is not a straight line, the curve clearly has a positive slope.
In addition to speedup, the following statistics are displayed: the time spent in rollback, the time for communication and for processing external events, and the time during which the simulation is suspended to prevent processes from running out of memory. For each measurement (i.e. number of simulators), the figure displays the maximum value over all partitions involved. There is a clear correlation between good speedup on the one hand and low rollback costs and rare suspension of the simulation on the other. The correspondence between peaks in speedup and valleys in communication is less distinct. For more than two nodes, memory consumption for state information always reaches the limits set by the nodes' physical memory capacity.

We have also gathered statistics on communication. Having set the parameters for event message buffering to a maximum event age of and a minimum message length of events, we found the average message length to be in the range of to kb for the simulation of c. For this message length, effective bandwidth is still far below its maximum value (see fig. ). Communication performance can be optimized by increasing the maximum event age parameter. For larger circuits, however, the buffer length can be expected to increase since the number of events generated in each simulation step will increase as partitions get larger.

For all our test runs, only the time for the simulation proper has been measured. Input and output files had to be accessed using Intel's remote hosting software. Therefore, I/O has been extremely slow. Unfortunately it was impossible to use the concurrent file system (CFS) because its use is not supported by MMK. However, since parallelization aims at accelerating computations, not I/O, omitting I/O times seems justified for the evaluation of a synchronisation protocol.

[Figure : GVT and LVTs vs. real time (trace generated by the TOPSYS software monitor). Circuit: c; clock resolution: ms; axes: simulated time vs. real time [sec]]

Monitoring LVTs and GVT. LVTs and GVT have been observed with the help of TOPSYS' distributed monitoring system. An inspect task on one node periodically broadcasts display commands for the LVT and GVT variables to the nodes and stores their replies in a buffer that is written to file after simulation has finished. This monitoring technique provides the best possible approximation to a global time base in the distributed memory multiprocessor. Figure shows a trace from the simulation of c.

Samadi's simple algorithm has been shown to approximate GVT sufficiently well. Its main benefit is the small number of messages per GVT computation. Its disadvantage of having to stop simulation during GVT computation is mitigated by event message buffering, which allows local simulation to proceed as long as no event messages are sent while processes are computing their local minima.

Optimized Simulation after Rollback. Reducing the number of element evaluations during re-simulation of a rolled back period of simulated time can significantly increase Time Warp's performance if elements are complex to evaluate. Its benefit, however, varies strongly with the number of processes in the parallel simulation. For an element evaluation time of ms the maximum increase in speedup that we have observed in the simulation of c is a factor of more than two (see fig. ). For some numbers of partitions there was, however, no noticeable benefit from optimized re-simulation.

CONCLUSIONS AND FUTURE WORK

Measurements have shown that Time Warp's efficiency strongly depends on an equal distribution of computation load among processes. Although elements have been evenly distributed among processes in static partitioning, LVTs diverge extremely. This observation emphasizes the need for dynamic repartitioning.

We have not yet been able to analyze Time Warp's behaviour and the effects of our optimizations comprehensively because a detailed study requires a very large number of measurements in which each of the numerous parameters affecting Time Warp's performance is modified in a controlled way. Moreover, program development and performance measurements were impeded by the fact that our iPSCs have been very unreliable for more than a year now (and still are). Hoping for the system's reliability to improve in the future, we intend to carry out more measurements, especially in order to evaluate our optimizations to the basic Time Warp mechanism.

[Figure : simulation of c (multiple delays). Panels: speedup; rollback time [sec]; communication + processing external events [sec]; simulation suspended [sec]]

[Figure : The effect of optimized re-simulation. Circuit: c (unit delay), element evaluation: ms; speedup of the basic Time Warp method vs. optimized re-simulation]

REFERENCES

[Bauer & Sporrer, 99] Bauer, H. & Sporrer, C. (99). Distributed Logic Simulation and an Approach to Asynchronous GVT-Calculation. In Proceedings of the 99 SCS Western Simulation Multiconference on Parallel and Distributed Simulation (PADS9) (pp. -9). Newport Beach, California.

[Bemmerl et al., 99] Bemmerl, T., Lindhof, R., & Treml, T. (99). The Distributed Monitor System of TOPSYS. In H. Burkhart (Ed.), Proceedings of CONPAR9 VAPP IV, volume 7 of LNCS (pp. 76-76). Zurich, Switzerland: Springer-Verlag.

[Jefferson, 98] Jefferson, D. (98). Virtual Time. ACM Transactions on Programming Languages and Systems, 7(), -.

[Krodel & Antreich, 99] Krodel, T. & Antreich, K. (99). An Accurate Model for Ambiguity Delay Simulation. In 7th ACM/IEEE Design Automation Conference (pp. -7).

[Lin & Lazowska, 99] Lin, Y.-B. & Lazowska, E. (99). Determining the Global Virtual Time in a Distributed Simulation. In Proceedings of the 99 International Conference on Parallel Processing, volume III (pp. -9).

[Luksch, 99] Luksch, P. (99). Parallele Logiksimulation auf Multiprozessoren mit verteiltem Speicher. In H. Fuss & P. Schwarz (Eds.), 8. Workshop Simulationsmethoden und -Sprachen für verteilte Systeme und parallele Prozesse, volume 7 of ASIM-Mitteilungen. Dresden: ASIM.

[Samadi, 98] Samadi, B. (98). Distributed Simulation, Algorithms and Performance Analysis. Technical Report, University of California, Los Angeles (UCLA).

[Vijayan, 989] Vijayan, G. (989). Min-Cost Partitioning on a Tree Structure and Applications. In 6th ACM/IEEE Design Automation Conference (pp. 77-77).

[Weitlich, 99] Weitlich, H. (99). Parallele Logiksimulation nach der Time-Warp-Methode auf einem Multiprozessorsystem mit verteiltem Speicher. Diplomarbeit, Technische Universität München, Institut für Informatik, München.


Networks. Wu-chang Fengy Dilip D. Kandlurz Debanjan Sahaz Kang G. Shiny. Ann Arbor, MI Yorktown Heights, NY 10598 Techniques for Eliminating Packet Loss in Congested TCP/IP Networks Wu-chang Fengy Dilip D. Kandlurz Debanjan Sahaz Kang G. Shiny ydepartment of EECS znetwork Systems Department University of Michigan

More information
