This article appeared in Proc. 7th IEEE Symposium on Computers and Communications, Taormina/Giardini Naxos, Italy, July 2002, IEEE Computer Society.

Software Supports for Preemptive Rollback in Optimistic Parallel Simulation on Myrinet Clusters

Francesco Quaglia and Andrea Santoro
DIS, Universita di Roma "La Sapienza"
Via Salaria 113, Roma, Italy

Abstract

In this paper we present a communication layer for Myrinet based clusters, designed to efficiently support preemptive rollback operations in optimistic parallel simulation. Beyond standard low latency message delivery functionalities, this layer also embeds functionalities allowing the overlying simulation application to efficiently track whether an incoming message will actually produce a causality inconsistency for the currently executed simulation event upon its receipt at the application level. Exploiting these functionalities, awareness of the inconsistency precedes the message receipt at the application level, thus allowing timely interruption of event execution for activating rollback procedures. Experimental results on a standard simulation benchmark show that the layer we implement allows a strong reduction of the rollback overhead which, in its turn, yields strong performance improvements (up to 33%), especially in case of large parallelism in the simulation model execution.

1 Introduction

In parallel discrete-event simulation, distinct parts of the system to be simulated are modeled by distinct Logical Processes (LPs), with their own local simulation clocks [4]. LPs execute sequences of simulation events, and each event execution possibly schedules new events to be executed by any LP at a later simulation time. The notification of scheduled events among distinct LPs takes place through the exchange of messages carrying the content and the occurrence time, namely the timestamp, of the event. In order to ensure correct simulation results, synchronization mechanisms are used to maintain a non-decreasing timestamp order for the execution of events at each LP. Optimistic synchronization [6] allows any LP to execute events unless its pending event set is empty, and uses checkpoint-based rollback to recover from timestamp order violations, namely causality inconsistencies. A rollback recovers the state of the LP to its value immediately prior to the violation. While rolling back, the LP undoes the effects of the events scheduled during the rolled back portion of the simulation. This is done by sending an antimessage for each event that must be undone. Upon the receipt of an antimessage that cancels an event already executed, the recipient LP rolls back as well. This is also referred to as "cascading rollback".

In current simulation software technology, event execution at an LP is typically handled as an atomic action, and the check for incoming messages/antimessages, which might reveal a causality inconsistency, is performed only in between consecutive event executions (1). This means that rollback operations for an LP, if required, never preempt any in-progress event execution activity [4]. Such preemption would, however, exhibit the following three advantages: (i) The simulation application does not waste CPU time to complete the execution of a causally inconsistent event that must be rolled back. (ii) Interruption of the event execution might avoid sending some event notification messages to other LPs. Those messages, if sent, would eventually have to be revoked through the corresponding antimessages.
Therefore, avoiding sending them might achieve a reduction of the communication overhead. Also, a reduction of the amount of messages to be revoked might reduce the likelihood of cascading rollback. (iii) Interruption of the event execution allows anticipating the sending of the antimessages required by rollback operations. Therefore we get a reduction of the likelihood that, upon the antimessage receipt, the corresponding message has already been processed by the recipient LP. This additionally contributes to reducing the likelihood of cascading rollback.

The reason why current simulation software technology does not employ preemptive rollback is the inadequacy of the functionalities of general purpose communication layers. Indeed, using any of those layers, the only chance an LP has to become aware, during event execution, of any message/antimessage revealing a causality inconsistency is to run the event routine and the message receipt module in time-interleaved mode. This is because the layer itself is not aware of the content of transmitted data, therefore causality inconsistency verification must be performed at the simulation application level.

(1) In the standard approach to optimistic parallel simulation, in case multiple LPs are hosted by the same machine, the schedule sequence of event executions is determined on the basis of the lowest-timestamp-first rule, which means that the LP scheduled for event execution is always the one having, in its pending event set, the non-executed event with the lowest timestamp. According to this scheduling rule, causality inconsistencies can occur only due to interprocessor communication.

However, this solution shows several drawbacks making it not viable. First, time interleaving might cause additional execution costs (i.e. application level context switch costs). More importantly, the event execution time is stretched, thus causing delay in sending notification messages for newly scheduled events, which possibly increases the likelihood of timestamp order violations at the recipient LPs [2, 12]. At worst, such an increase might give rise to rollback thrashing in the parallel execution.

In this paper we present software supports for preemptive rollback in case of optimistic parallel simulations on Myrinet clusters. These supports allow the implementation of preemptive rollback operations with no interleaving between event routine and message receipt at the application level. Actually, Myrinet network cards, as well as other types of network cards of the last generation [7], are equipped with a programmable processor, typically used only for message transfer purposes. We employ that same processor to perform "on-the-fly" verification on incoming messages/antimessages, so as to ascertain whether the LP being executed is causally consistent. Then, the simulation application software reads the verification outcome at negligible cost by means of calls to a lightweight function belonging to the API associated with the communication layer. We test the effectiveness of these software supports by means of an empirical study on a standard simulation benchmark. The data show that rollback costs can be definitely reduced, with a consequent increase (up to 33%) of the speed of the parallel execution.

The remainder of this paper is organized as follows. In Section 2 we describe the software supports for preemptive rollback we have developed. Performance results are reported in Section 3.

2 The Software Supports

The software supports we present are an extension of a classical message passing layer for Myrinet clusters. To make the extension design clear, we preliminarily report a brief description of the Myrinet architecture and of a classical layer organization. Then the extension is presented in detail.

2.1 Myrinet Overview

A Myrinet network consists of a switch based on wormhole technology and of interface cards in between the switch and the hosts, which are based on the LANai chip [8, 9, 10]. At a high level, the interface card architecture can be schematized as shown in Figure 1. Specifically, it is a programmable communication device equipped with: (a) An on-board programmable RISC Processor (RP). (b) A RAM bank memory (Internal Memory, namely IM), accessible by RP, which can be mapped into the host address space. (c) A packet interface towards the switch, associated with Send/Receive DMA engines used for IM/packet-interface data transfer and vice versa. (d) A PCI bridge towards the host. (e) A DMA engine, called EBUS DMA (External Bus DMA), able to transfer data from the IM to the host main memory and vice versa. All the components of the architecture are interconnected through an Internal Bus (IB), and the three DMA engines are programmable by RP. As a final observation, RP cannot access the host memory directly.

[Figure 1. Myrinet Card Architecture: packet interface towards the switch with Send/Receive DMA engines, RISC Processor (RP), RAM bank (IM) and EBUS DMA on the Internal Bus (IB), PCI bridge towards the host and the receive queue in host memory.]
Nonetheless, the program run by RP, typically known as the Control Program (CP), can activate the EBUS DMA for performing data transfer operations to/from the host memory.

As a common choice in fast messaging layers for Myrinet, see for example [11], messages incoming from the switch are temporarily buffered into the IM, with the data transfer between the packet interface and the IM taking place through the Receive DMA. The messages are then transferred into the receive queue, located in host memory, through the EBUS DMA (see the directed dotted line in Figure 1). Once transferred into the receive queue, a message is received by the application program very efficiently by simply performing a memcpy() of the message content into a proper buffer in the application address space. In case multiple messages stored in contiguous slots of the IM must be transferred into the receive queue, a single EBUS DMA transfer is used for the whole set of messages. This technique is referred to as "block-dma". Another common choice consists of performing a send operation according to the "zero-copy" method, which means that the content of the message to be sent is directly copied into the IM (with no intermediate buffering into host memory). Then the message is transferred onto the network through the Send DMA. The responsibility to program the three DMA engines anytime a given message transfer operation must be supported pertains to the CP run by RP. The main loop of the CP for our implementation is as follows:

1. While (1){
2.   if (message needs to be sent) activate_send_dma();
3.   if (message needs to be received) activate_receive_dma();
4.   if (Send DMA completed) complete_send();
5.   if (Receive DMA completed) complete_receive();
6.   if (block-dma needed) block_dma();
7. }
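To make the host-side receive path concrete, the following is a minimal sketch of how an application could drain the receive queue; the slot layout, sizes, and names (rx_slot, receive_next()) are illustrative assumptions, not the layer's actual definitions.

#include <string.h>

#define SLOT_SIZE   2048                      /* assumed fixed slot size       */
#define QUEUE_SLOTS 256                       /* assumed queue depth           */

struct rx_slot {
    volatile int valid;                       /* set by the CP after EBUS DMA  */
    int          len;                         /* message length in bytes       */
    char         data[SLOT_SIZE];             /* raw message content           */
};

static struct rx_slot rx_queue[QUEUE_SLOTS];  /* filled by the EBUS DMA        */
static int rx_head = 0;

/* Copy the next delivered message into an application buffer; returns its
 * length, or -1 if no message has been delivered yet. */
int receive_next(void *app_buf)
{
    struct rx_slot *s = &rx_queue[rx_head];
    int n;

    if (!s->valid)
        return -1;
    n = s->len;
    memcpy(app_buf, s->data, n);              /* the memcpy() mentioned above  */
    s->valid = 0;                             /* hand the slot back to the CP  */
    rx_head = (rx_head + 1) % QUEUE_SLOTS;
    return n;
}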

2.2 Layer Extension for Preemptive Rollback

We decided to extend the classical layer functionalities by actively employing the potential of RP. The aim of the extension is to introduce an on-the-fly causality inconsistency tracking mechanism, as schematized in Figure 2. The idea is to identify, in the incoming data stream, all the messages/antimessages destined to the LP currently executing a simulation event, if any. Then causality inconsistency verification is performed by comparing the timestamp of an identified message/antimessage with the current simulation clock of that LP. Notification of causally inconsistent execution at the application level takes place by setting a flag whose value can be checked by the LP at its own pace during event execution. Note that identification and verification performed at the level of the CP run by RP can take place without altering the order in which messages incoming from the switch are inserted into the receive queue located in host memory. As in our implementation, this order is typically FCFS, which, together with the FCFS property guaranteed by transmission through the Myrinet switch, ensures point-to-point message delivery reflecting the message send order. Actually, identification and verification can take place just after the incoming message/antimessage has been buffered into the IM on-board of the card (recall buffering is performed by the Receive DMA). Note that identification and verification performed at that point do not require structural modifications to the CP, and do not require special handling, such as an additional intermediate buffering of the message/antimessage.

[Figure 2. Causality Inconsistency Tracking: an incoming message/antimessage is verified on the Myrinet card, which sets a flag; the host polls the flag during event execution and interrupts the execution in case a violation is revealed, while receive queue housekeeping operations proceed as usual.]

In our implementation, the function performing identification and verification has been called peek(), and the main loop of the CP has been slightly modified as shown below in order to embed a call to this function as soon as buffering of a message/antimessage into the IM is completed (see lines 5 to 7 of the loop):

1. While (1){
2.   if (message needs to be sent) activate_send_dma();
3.   if (message needs to be received) activate_receive_dma();
4.   if (Send DMA completed) complete_send();
5.   if (Receive DMA completed){
6.     complete_receive();
7.     peek();
8.   }
9.   if (block-dma needed) block_dma();
10.}

The function peek() needs the following information to perform its task: (i) the identifier of the LP, if any, currently executing a simulation event; (ii) the current simulation clock of this LP. This information must be made available in the IM since, as pointed out before, the CP cannot access host memory. To this purpose we have introduced, in the API associated with the layer, the function set_tracking(int LP_id, double simulation_clock), which writes both the LP identifier and its simulation clock to a proper buffer located in the IM on-board of the card (recall the IM is mapped in the host address space, so that the host can perform read/write operations on it). Then the function peek() determines the occurrence of a causality inconsistency on the running LP by simply comparing the current LP simulation clock and the timestamp of any incoming message/antimessage destined to that LP. If the timestamp value is less than the simulation clock value, the current event execution is not causally consistent.
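As an illustration, the core comparison performed by peek() can be sketched as follows. The sketch assumes the message format the layer requires (first four bytes the destination LP id as an integer, the following eight bytes the timestamp as a double, as specified at the end of Section 2.3); the function name and the direct buffer access are illustrative.

#include <string.h>

/* Core of the verification performed by peek(): given a message or
 * antimessage buffered in the IM, check whether it reveals a causality
 * inconsistency for the tracked LP. */
int causally_inconsistent(const char *msg, int tracked_lp,
                          double tracked_clock)
{
    int    dest_lp;
    double timestamp;

    /* Required format: first four bytes the destination LP id (int),
     * following eight bytes the timestamp (double). */
    memcpy(&dest_lp,   msg,               sizeof(int));
    memcpy(&timestamp, msg + sizeof(int), sizeof(double));

    if (dest_lp != tracked_lp)         /* not for the running LP          */
        return 0;
    return timestamp < tracked_clock;  /* timestamp in the LP's past      */
}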
Beyond writing the LP identifier and its simulation clock to a proper buffer in the IM, set_tracking() also has the responsibility to enable the causality inconsistency tracking performed at the level of the CP. This is done by setting a flag, namely control_flag, implemented as a word (4 bytes) located in the IM (2). Flag setting indicates to the function peek() that causality inconsistency tracking is currently required by the application level. peek() checks the value of control_flag as its first action and avoids performing causality inconsistency tracking in case the flag is not set. In other words, set_tracking() acts as an initializer/activator for the causality inconsistency tracking mechanism. This function should be called right after the simulation clock of the LP scheduled for running is updated to the timestamp of the event that is going to be executed. In standard simulation code, this happens as soon as the LP enters the event routine. (A more detailed description of the usage, at the simulation application level, of the whole set of preemptive rollback support functions belonging to the API will be presented in Section 2.3.)

Disabling causality inconsistency tracking, i.e. resetting control_flag, can be performed through the function reset_tracking(), which we have introduced in the API associated with the layer. This function should be called at the application level prior to the start of any period in which causality inconsistency tracking is not required.

The reason why we have introduced control_flag is twofold. First, disabling causality inconsistency tracking whenever it is not required by the application level increases the responsiveness of the CP in programming/controlling the DMA engines on-board of the card. Second, and more important, critical races between the software run at the host side and the CP run by RP might occur, causing errors in the outcome of the causality inconsistency tracking mechanism. Actually, adequate management of control_flag as a three-state flag (see Figure 3) prevents those races.

[Figure 3. Transition State Diagram: control_flag moves among the states OFF, ON and SHUT-DOWN, with set_tracking(), reset_tracking() and the CP each in charge of one of the three transitions.]

(2) We have used a word instead of a single byte for control_flag since the Myrinet cards' internal circuitry is optimized for aligned word access.
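The following sketch shows one plausible realization of peek()'s flag handling, consistent with the description above; the specific transition assignment (set_tracking(): OFF to ON; reset_tracking(): ON to SHUT-DOWN; CP: SHUT-DOWN to OFF) is our reading of the state diagram, and the tracking-buffer layout is illustrative. It sets the inconsistency_flag discussed next.

/* Assumed encoding of the three control_flag states. */
enum { OFF = 0, ON = 1, SHUT_DOWN = 2 };

struct tracking {                    /* illustrative IM-resident buffer   */
    volatile int    control_flag;
    volatile int    inconsistency_flag;
    volatile int    lp_id;
    volatile double simulation_clock;
};

int causally_inconsistent(const char *msg, int lp, double clock);
                                     /* from the previous sketch          */

void peek(struct tracking *t, const char *msg)
{
    if (t->control_flag == SHUT_DOWN) {
        t->control_flag = OFF;       /* only the CP clears the flag, so a
                                        host-side set_tracking() can never
                                        race with a stale shut-down       */
        return;
    }
    if (t->control_flag != ON)       /* tracking not requested            */
        return;
    if (causally_inconsistent(msg, t->lp_id, t->simulation_clock))
        t->inconsistency_flag = 1;   /* TRUE: violation revealed          */
}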

If a causality inconsistency is detected, the function peek() sets to TRUE a flag, namely inconsistency_flag, implemented as a word located in the IM. This flag is reset to FALSE by set_tracking() upon its call. The application software checks the value of inconsistency_flag through the function check_causality_inconsistency(), which we have introduced into the API. This function simply returns the flag value, namely TRUE in case a causality inconsistency has been revealed, FALSE otherwise. In other words, the function check_causality_inconsistency() can be used at the simulation application level for implementing an efficient polling based approach to the detection of causality inconsistency of the running LP.

2.3 Usage at the Application Level

Preemptive rollback support functions are expected to be employed as follows. As soon as the simulation clock of the LP is moved to the event to be executed (this typically happens at the start of event execution), the function set_tracking() should be called. In case this function is called prior to updating the LP simulation clock, causality inconsistency tracking might result less effective, since it cannot reveal a timestamp order violation occurring in the simulation time interval between the not yet updated value of the simulation clock and the timestamp of the event currently executed by the LP. Just before the event execution is completed, the function reset_tracking() should be called to disable causality inconsistency tracking.

The two functions above initialize/enable and disable causality inconsistency tracking. In order to actually read the tracking outcome, periodic calls to check_causality_inconsistency() should be issued. In case one of these calls reveals a causality inconsistency, the event execution should be interrupted. As a last action before interruption, the function reset_tracking() should be called to disable the causality inconsistency tracking mechanism again, until a new call to set_tracking() is issued by any LP that will be scheduled for event execution.

We have measured the time required to execute the function check_causality_inconsistency() on a Pentium II 300 MHz running LINUX and equipped with an M2M-PCI32C Myrinet card (3), and the measurement outcome is about 0.5 microseconds. Considering that parallel discrete event technology is typically adopted for simulation models with event granularity (i.e. event execution time) ranging from several tens or hundreds of microseconds up to some milliseconds or more on current conventional architectures, the overhead due to the polling performed by check_causality_inconsistency() is negligible in practice, even if several calls to this function are performed within a single event execution. With respect to the strategy for placing these calls within the event code, we note that revealing a causality inconsistency right before sending out any notification message for a new event is likely to amplify the benefits from preemptive rollback.

(3) The data we report in Section 3 refer to experiments executed on a cluster of this type of PCs.
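Putting the protocol of this subsection together, an event routine would be structured roughly as follows; event_completed(), do_some_event_work() and interrupt_and_rollback() are hypothetical placeholders for simulator-specific code.

/* API functions of the extended layer, as described above. */
void set_tracking(int LP_id, double simulation_clock);
void reset_tracking(void);
int  check_causality_inconsistency(void);

/* Hypothetical simulator-specific hooks. */
int  event_completed(void);
void do_some_event_work(void);
void interrupt_and_rollback(void);

void execute_event(int lp_id, double event_timestamp)
{
    /* The LP simulation clock has just been moved to the event's
     * timestamp; enable tracking right after that. */
    set_tracking(lp_id, event_timestamp);

    while (!event_completed()) {
        do_some_event_work();
        /* Cheap polling of the tracking outcome (about 0.5 microseconds
         * per call on the platform described above). */
        if (check_causality_inconsistency()) {
            reset_tracking();         /* disable tracking, then interrupt */
            interrupt_and_rollback();
            return;
        }
    }
    reset_tracking();                 /* disable tracking before leaving  */
}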
From the point of view of pure communication, the Myrinet layer we have extended with the previously described functionalities is characterized by the following pair of API functions: send(int msg_type, int machine_id, void *message) and receive(int msg_type, void *message, int *machine_id). Both the send and the receive function treat the message content as a simple sequence of bytes, therefore the overlying application is in charge of defining an adequate structure for the content itself. On the other hand, the layer extension we have presented accesses the content (i.e. the identifier of the destination LP and the timestamp) at the level of the CP through the function peek(). To make this access work properly, any message/antimessage must comply with the following format: (i) the first four bytes must contain the LP id of the destination LP, expressed as an integer; (ii) the following eight bytes must contain the timestamp, expressed as a double precision floating point number. The rest of the message is of no concern to the CP; a sketch of a compliant header construction is shown below.
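Because a C struct holding an int followed by a double would normally get padding bytes between the fields, a compliant header is best built at explicit byte offsets. The helper below is an illustrative sketch, not part of the layer's API.

#include <string.h>

/* Build a header complying with the required format at explicit byte
 * offsets (a struct { int; double; } would typically contain 4 padding
 * bytes between the fields, breaking the expected layout). */
void build_header(char *msg, int dest_lp, double timestamp)
{
    memcpy(msg,               &dest_lp,   sizeof(int));    /* bytes 0-3  */
    memcpy(msg + sizeof(int), &timestamp, sizeof(double)); /* bytes 4-11 */
    /* Application-defined content may follow from byte 12 onward. */
}

An application would then fill its payload after the first twelve bytes and pass the buffer to send().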

3 Performance Data

In this section we report the results of an empirical study on the effectiveness of the software supports for preemptive rollback we have implemented. The experiments were all performed on a cluster of 8 Pentium II 300 MHz PCs (128 Mbytes RAM, 512 Kbytes second level cache). All the PCs of the cluster run LINUX and are equipped with M2M-PCI32C Myrinet cards. We have used a classical parameterized simulation benchmark, namely the PHOLD model [3]. This benchmark consists of a fixed number of LPs and of a constant number of fictitious simulation events circulating among the LPs. The execution of an event at an LP has the only effect of scheduling a new event to be executed by any LP, with a timestamp increment selected according to some stochastic distribution. We have chosen this simulation benchmark for two main reasons: (i) its parameters (e.g. routing of events, timestamp increment distributions, event granularity, etc.) can be easily modified; (ii) it is the most used benchmark for testing the effectiveness of optimizations in support of parallel discrete event simulation [1, 13, 14, 15, 16, 17, 18, 19, 20].

In our experiments we have considered a PHOLD model with an exponential distribution for the timestamp increment, with mean 10 simulation time units, and we have fixed the amount of initial fictitious simulation events at 5 per LP (a similar workload has been used, as an example, in the experimental study in [19]). Also, the event execution time has been set at about 100 microseconds, which is an intermediate value for parallel discrete event applications. As in the common approach [19], this has been done by structuring the event code as a simple busy loop. The independent parameter in the study is the number of LPs per machine, which has been varied from 1 to 8.

[Figure 4. Ratio between the ER with preemptive rollback and the ER with non-preemptive rollback, vs. the number of LPs per machine.]
[Figure 5. EFF vs. the number of LPs per machine (non-preemptive vs. preemptive rollback).]
[Figure 6. F vs. the number of LPs per machine.]
[Figure 7. COPCE vs. the number of LPs per machine (non-preemptive vs. preemptive rollback).]

We report measures related to the following parameters. The event rate (ER), that is, the number of committed simulation events per second. This parameter indicates how fast the simulation execution is; it is therefore representative of the achieved performance. The efficiency (EFF), that is, the ratio between the number of committed simulation events and the total number of executed simulation events (committed plus rolled back). This parameter is an indicator of the percentage of productive simulation work performed. The communication overhead per committed event (COPCE), that is, the number of antimessages plus the number of revoked messages per committed event (simulation messages/antimessages that do not need to be transferred over the network, i.e. those destined to local LPs, are not included in the measurement). This parameter is an indicator of the additional communication cost, which must be paid due to the rollback mechanism, for each productive simulation event execution.

We have measured the previous parameters both for the case of a typical execution with no preemptive rollback and for the case of preemptive rollback based on periodic calls (polling) to check_causality_inconsistency() embedded within the previously mentioned busy loop characterizing the event execution. Polling is done every 10 microseconds; therefore, recalling that a single call to check_causality_inconsistency() takes about 0.5 microseconds on this experimental platform (see Section 2.3), we get an overhead on the event execution of about 5%. For the case of preemptive rollback we have also measured the fraction F of rolled back events that have also been preempted during their execution. This parameter indicates the effectiveness of preemptive rollback in revealing causality inconsistencies just during event execution. Each reported value results as the average over 10 runs, all done with different random seeds for the stochastic timestamp increment. Each run was carried out for at least a fixed minimum number of committed events.
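Before turning to the results, the three measures can be summarized in code as follows; the counter names are illustrative.

/* Illustrative computation of the reported measures from raw counters. */
struct run_counters {
    long   committed;        /* committed simulation events               */
    long   rolled_back;      /* rolled back simulation events             */
    long   antimessages;     /* antimessages sent over the network        */
    long   revoked;          /* revoked messages sent over the network    */
    double wall_clock_secs;  /* duration of the run                       */
};

double ER(const struct run_counters *c)     /* committed events per second */
{
    return c->committed / c->wall_clock_secs;
}

double EFF(const struct run_counters *c)    /* fraction of useful work     */
{
    return (double)c->committed / (c->committed + c->rolled_back);
}

double COPCE(const struct run_counters *c)  /* overhead per committed event */
{
    return (double)(c->antimessages + c->revoked) / c->committed;
}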
In Figure 4, the ratio between the ER achieved in case of preemptive rollback and the ER achieved with classical non-preemptive rollback is plotted. The data show that when the degree of parallelism in the simulation model execution is high (i.e. 1-2 LPs per machine), preemptive rollback allows an execution speed which is between 1.33 and 1.10 times the one achievable with non-preemptive rollback, i.e. a gain between 33% and 10%. The gain then disappears when the degree of parallelism gets lower (i.e. 4-8 LPs per machine). This is an expected behavior when looking at the plots of EFF in Figure 5. Specifically, EFF in the order of 85-90%, achieved for the case of 4-8 LPs per machine, indicates that the likelihood of causally inconsistent execution is definitely low (4). As a consequence, there is a reduced likelihood that, upon the arrival of a message/antimessage at the destination host, the currently executed event, if any, needs to be interrupted due to a causality inconsistency. Therefore, the usage of preemptive rollback actually leads to parallel execution dynamics which are quite similar to those achieved with classical non-preemptive rollback. As an empirical support to this deduction, the plot for F in Figure 6 clearly indicates that the fraction of preempted events among those rolled back becomes minimal for the 4-8 LPs per machine case. However, this is not a flaw of the preemptive technique since, for parallel executions exhibiting high efficiency, different dynamics between preemptive and non-preemptive rollback are anyway expected to produce very similar performance. This is due to the fact that the cost of rollback is limited; therefore, even if preemptive rollback achieved rollback cost reductions, the final impact on performance would be negligible in practice.

As an additional observation, the plots in Figure 5 show that preemptive and non-preemptive rollback exhibit in practice the same values for EFF. Instead, the plots in Figure 7 show that, for the case of a high degree of parallelism in the simulation model execution, preemptive rollback achieves a strong reduction of COPCE (up to 20%). This is an indication that the gain in the execution speed is not due to variations in the amount of rollback, but to the reduced costs associated with any single rollback operation. (Note that the reduction of the rollback cost achieved by preemptive rollback derives not only from the reduced values of COPCE, but also from the reduced waste of CPU time due to the timely interruption of a causally inconsistent event execution.) Additional advantages can be expected whenever preemptive rollback also triggers a reduction in the amount of rollback, as is within its potential.

Finally, we note that a high degree of parallelism in the simulation model execution is typical of the recent parallel simulation approach based on federations of existing sequential simulators (in this solution we get a situation similar to the 1 LP per machine execution [5], since each LP can be considered, in practice, as a sequential discrete event simulator). On the basis of the previous results, adopting an optimistic synchronization mechanism among the federated simulators could take strong advantage of the software supports for preemptive rollback we have developed.

(4) The increase in efficiency for the case of a limited degree of parallelism derives from the fact that the simulation clocks of the LPs typically exhibit less divergence as compared to a high degree of parallelism, which yields reduced amounts of causality violations.

4 Summary

In this paper we have presented software supports for the implementation of preemptive rollback operations relying on the timely interruption of the execution of a causally inconsistent simulation event. These supports have been developed for optimistic parallel discrete event simulators running on top of Myrinet based architectures, and derive from an extension of a classical Myrinet message-passing layer. We have discussed how to use the preemptive rollback functions at the simulation application level and we have also reported the results of an empirical study on a standard simulation benchmark, which demonstrate the effectiveness of the presented extension. Specifically, these results show that the performance can be increased by up to 33% in case of a high degree of parallelism in the simulation model execution.
As discussed, the extension should be particularly useful for the recent approach to parallel discrete event simulation based on federated optimistic execution of multiple sequential discrete event simulators, which typically exhibits a high degree of execution parallelism.

References

[1] D. Bruce, "The Treatment of State in Optimistic Systems", Proc. 9th Workshop on Parallel and Distributed Simulation (PADS'95), pp.40-49, June 1995.
[2] C.D. Carothers, R. Fujimoto and P. England, "Effect of Communication Overheads on Time Warp Performance: An Experimental Study", Proc. 8th Workshop on Parallel and Distributed Simulation (PADS'94), July 1994.
[3] R.M. Fujimoto, "Performance of Time Warp Under Synthetic Workloads", Proc. Multiconf. on Distributed Simulation, Vol.22, No.1, January 1990.
[4] R.M. Fujimoto, "Parallel Discrete Event Simulation", Communications of the ACM, Vol.33, No.10, pp.30-53, 1990.
[5] R.M. Fujimoto, "Exploiting Temporal Uncertainty in Parallel and Distributed Simulations", Proc. 13th Workshop on Parallel and Distributed Simulation (PADS'99), pp.46-53, May 1999.
[6] D.R. Jefferson, "Virtual Time", ACM Transactions on Programming Languages and Systems, Vol.7, No.3, pp.404-425, 1985.
[7] "Intelligent Ethernet Interface Solutions".
[8] MYRICOM, "LANai 4", Draft.
[9] MYRICOM, "LANai 7", Draft.
[10] MYRICOM, "LANai 9".
[11] S. Pakin, M. Lauria and A. Chien, "High Performance Messaging on Workstations: Illinois Fast Messages (FM) for Myrinet", Proc. Supercomputing'95, 1995.
[12] B.R. Preiss, W.M. Loucks and D. MacIntyre, "Effects of the Checkpoint Interval on Time and Space in Time Warp", ACM Transactions on Modeling and Computer Simulation, Vol.4, No.3, 1994.
[13] F. Quaglia, "Event History Based Sparse State Saving in Time Warp", Proc. 12th Workshop on Parallel and Distributed Simulation (PADS'98), pp.72-79, May 1998.
[14] F. Quaglia, "A Cost Model for Selecting Checkpoint Positions in Time Warp Parallel Simulation", IEEE Transactions on Parallel and Distributed Systems, Vol.12, No.4, April 2001.
[15] F. Quaglia and V. Cortellessa, "Grain Sensitive Event Scheduling in Time Warp Parallel Discrete Event Simulation", Proc. 14th Workshop on Parallel and Distributed Simulation (PADS'00), May 2000.
[16] R. Ronngren and R. Ayani, "Adaptive Checkpointing in Time Warp", Proc. 8th Workshop on Parallel and Distributed Simulation (PADS'94), May 1994.
[17] T.K. Som and R.G. Sargent, "A Probabilistic Event Scheduling Policy for Optimistic Parallel Discrete Event Simulation", Proc. 12th Workshop on Parallel and Distributed Simulation (PADS'98), pp.56-63, May 1998.
[18] S. Skold and R. Ronngren, "Event Sensitive State Saving in Time Warp Parallel Discrete Event Simulations", Proc. Winter Simulation Conference, December 1996.
[19] S. Srinivasan and P.F. Reynolds Jr., "Elastic Time", ACM Transactions on Modeling and Computer Simulation, Vol.8, No.2, 1998.
[20] D. West and K. Panesar, "Automatic Incremental State Saving", Proc. 10th Workshop on Parallel and Distributed Simulation (PADS'96), pp.78-85, May 1996.


More information

Adaptive Memory Allocations in Clusters to Handle Unexpectedly Large Data-Intensive Jobs

Adaptive Memory Allocations in Clusters to Handle Unexpectedly Large Data-Intensive Jobs IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 15, NO. 7, JULY 2004 577 Adaptive Memory Allocations in Clusters to Handle Unexpectedly Large Data-Intensive Jobs Li Xiao, Member, IEEE Computer

More information

DSP/BIOS Kernel Scalable, Real-Time Kernel TM. for TMS320 DSPs. Product Bulletin

DSP/BIOS Kernel Scalable, Real-Time Kernel TM. for TMS320 DSPs. Product Bulletin Product Bulletin TM DSP/BIOS Kernel Scalable, Real-Time Kernel TM for TMS320 DSPs Key Features: Fast, deterministic real-time kernel Scalable to very small footprint Tight integration with Code Composer

More information

sequence number trillian:1166_==>_marvin:3999 (time sequence graph)

sequence number trillian:1166_==>_marvin:3999 (time sequence graph) Fixing Two BSD TCP Bugs Mark Allman Sterling Software NASA Lewis Research Center 21000 Brookpark Rd. MS 54-2 Cleveland, OH 44135 mallman@lerc.nasa.gov CR-204151 Abstract 2 Two Segment Initial Window This

More information

Architectural Differences nc. DRAM devices are accessed with a multiplexed address scheme. Each unit of data is accessed by first selecting its row ad

Architectural Differences nc. DRAM devices are accessed with a multiplexed address scheme. Each unit of data is accessed by first selecting its row ad nc. Application Note AN1801 Rev. 0.2, 11/2003 Performance Differences between MPC8240 and the Tsi106 Host Bridge Top Changwatchai Roy Jenevein risc10@email.sps.mot.com CPD Applications This paper discusses

More information

Introduction to Operating Systems. Chapter Chapter

Introduction to Operating Systems. Chapter Chapter Introduction to Operating Systems Chapter 1 1.3 Chapter 1.5 1.9 Learning Outcomes High-level understand what is an operating system and the role it plays A high-level understanding of the structure of

More information

COMPILED CODE IN DISTRIBUTED LOGIC SIMULATION. Jun Wang Carl Tropper. School of Computer Science McGill University Montreal, Quebec, CANADA H3A2A6

COMPILED CODE IN DISTRIBUTED LOGIC SIMULATION. Jun Wang Carl Tropper. School of Computer Science McGill University Montreal, Quebec, CANADA H3A2A6 Proceedings of the 2006 Winter Simulation Conference L. F. Perrone, F. P. Wieland, J. Liu, B. G. Lawson, D. M. Nicol, and R. M. Fujimoto, eds. COMPILED CODE IN DISTRIBUTED LOGIC SIMULATION Jun Wang Carl

More information

GFS: The Google File System

GFS: The Google File System GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one

More information

Optimistic Implementation of. Bulk Data Transfer Protocols. John B. Carter. Willy Zwaenepoel. Rice University. Houston, Texas.

Optimistic Implementation of. Bulk Data Transfer Protocols. John B. Carter. Willy Zwaenepoel. Rice University. Houston, Texas. Optimistic Implementation of Bulk Data Transfer Protocols John B. Carter Willy Zwaenepoel Department of Computer Science Rice University Houston, Texas Abstract During a bulk data transfer over a high

More information

Routing without Flow Control Hot-Potato Routing Simulation Analysis

Routing without Flow Control Hot-Potato Routing Simulation Analysis Routing without Flow Control Hot-Potato Routing Simulation Analysis Lawrence Bush Computer Science Department Rensselaer Polytechnic Institute Troy, New York December 5, 22 Abstract This paper presents

More information

Analyzing Memory Access Patterns and Optimizing Through Spatial Memory Streaming. Ogün HEPER CmpE 511 Computer Architecture December 24th, 2009

Analyzing Memory Access Patterns and Optimizing Through Spatial Memory Streaming. Ogün HEPER CmpE 511 Computer Architecture December 24th, 2009 Analyzing Memory Access Patterns and Optimizing Through Spatial Memory Streaming Ogün HEPER CmpE 511 Computer Architecture December 24th, 2009 Agenda Introduction Memory Hierarchy Design CPU Speed vs.

More information

THE IMPLEMENTATION OF A DISTRIBUTED FILE SYSTEM SUPPORTING THE PARALLEL WORLD MODEL. Jun Sun, Yasushi Shinjo and Kozo Itano

THE IMPLEMENTATION OF A DISTRIBUTED FILE SYSTEM SUPPORTING THE PARALLEL WORLD MODEL. Jun Sun, Yasushi Shinjo and Kozo Itano THE IMPLEMENTATION OF A DISTRIBUTED FILE SYSTEM SUPPORTING THE PARALLEL WORLD MODEL Jun Sun, Yasushi Shinjo and Kozo Itano Institute of Information Sciences and Electronics University of Tsukuba Tsukuba,

More information

Department of Computer Science, Institute for System Architecture, Operating Systems Group. Real-Time Systems '08 / '09. Hardware.

Department of Computer Science, Institute for System Architecture, Operating Systems Group. Real-Time Systems '08 / '09. Hardware. Department of Computer Science, Institute for System Architecture, Operating Systems Group Real-Time Systems '08 / '09 Hardware Marcus Völp Outlook Hardware is Source of Unpredictability Caches Pipeline

More information

Memory. From Chapter 3 of High Performance Computing. c R. Leduc

Memory. From Chapter 3 of High Performance Computing. c R. Leduc Memory From Chapter 3 of High Performance Computing c 2002-2004 R. Leduc Memory Even if CPU is infinitely fast, still need to read/write data to memory. Speed of memory increasing much slower than processor

More information

Towards Adaptive Caching for Parallel and Discrete Event Simulation

Towards Adaptive Caching for Parallel and Discrete Event Simulation Towards Adaptive Caching for Parallel and Discrete Event Simulation Abhishek Chugh and Maria Hybinette Computer Science Department The University of Georgia 415 Boyd Graduate Studies Research Center Athens,

More information

A Concurrency Control for Transactional Mobile Agents

A Concurrency Control for Transactional Mobile Agents A Concurrency Control for Transactional Mobile Agents Jeong-Joon Yoo and Dong-Ik Lee Department of Information and Communications, Kwang-Ju Institute of Science and Technology (K-JIST) Puk-Gu Oryong-Dong

More information

Q.1 Explain Computer s Basic Elements

Q.1 Explain Computer s Basic Elements Q.1 Explain Computer s Basic Elements Ans. At a top level, a computer consists of processor, memory, and I/O components, with one or more modules of each type. These components are interconnected in some

More information

Module 11: I/O Systems

Module 11: I/O Systems Module 11: I/O Systems Reading: Chapter 13 Objectives Explore the structure of the operating system s I/O subsystem. Discuss the principles of I/O hardware and its complexity. Provide details on the performance

More information

Chair for Network Architectures and Services Prof. Carle Department of Computer Science TU München. Parallel simulation

Chair for Network Architectures and Services Prof. Carle Department of Computer Science TU München. Parallel simulation Chair for Network Architectures and Services Prof. Carle Department of Computer Science TU München Parallel simulation Most slides/figures borrowed from Richard Fujimoto Parallel simulation: Summary/Outline

More information

Main Points of the Computer Organization and System Software Module

Main Points of the Computer Organization and System Software Module Main Points of the Computer Organization and System Software Module You can find below the topics we have covered during the COSS module. Reading the relevant parts of the textbooks is essential for a

More information

Chapter 4. Routers with Tiny Buffers: Experiments. 4.1 Testbed experiments Setup

Chapter 4. Routers with Tiny Buffers: Experiments. 4.1 Testbed experiments Setup Chapter 4 Routers with Tiny Buffers: Experiments This chapter describes two sets of experiments with tiny buffers in networks: one in a testbed and the other in a real network over the Internet2 1 backbone.

More information

FB(9,3) Figure 1(a). A 4-by-4 Benes network. Figure 1(b). An FB(4, 2) network. Figure 2. An FB(27, 3) network

FB(9,3) Figure 1(a). A 4-by-4 Benes network. Figure 1(b). An FB(4, 2) network. Figure 2. An FB(27, 3) network Congestion-free Routing of Streaming Multimedia Content in BMIN-based Parallel Systems Harish Sethu Department of Electrical and Computer Engineering Drexel University Philadelphia, PA 19104, USA sethu@ece.drexel.edu

More information

residual residual program final result

residual residual program final result C-Mix: Making Easily Maintainable C-Programs run FAST The C-Mix Group, DIKU, University of Copenhagen Abstract C-Mix is a tool based on state-of-the-art technology that solves the dilemma of whether to

More information

1 Introduction A mobile computing system is a distributed system where some of nodes are mobile computers [3]. The location of mobile computers in the

1 Introduction A mobile computing system is a distributed system where some of nodes are mobile computers [3]. The location of mobile computers in the Low-Cost Checkpointing and Failure Recovery in Mobile Computing Systems Ravi Prakash and Mukesh Singhal Department of Computer and Information Science The Ohio State University Columbus, OH 43210. e-mail:

More information

Chapter 1 Computer System Overview

Chapter 1 Computer System Overview Operating Systems: Internals and Design Principles Chapter 1 Computer System Overview Ninth Edition By William Stallings Operating System Exploits the hardware resources of one or more processors Provides

More information

FM24C Kb FRAM Serial Memory Features

FM24C Kb FRAM Serial Memory Features Preliminary FM24C512 512Kb FRAM Serial Memory Features 512Kbit Ferroelectric Nonvolatile RAM Organized as 65,536 x 8 bits High Endurance 10 Billion (10 10 ) Read/Writes 45 year Data Retention NoDelay Writes

More information

CSC 553 Operating Systems

CSC 553 Operating Systems CSC 553 Operating Systems Lecture 1- Computer System Overview Operating System Exploits the hardware resources of one or more processors Provides a set of services to system users Manages secondary memory

More information

Efficiency and memory footprint of Xilkernel for the Microblaze soft processor

Efficiency and memory footprint of Xilkernel for the Microblaze soft processor Efficiency and memory footprint of Xilkernel for the Microblaze soft processor Dariusz Caban, Institute of Informatics, Gliwice, Poland - June 18, 2014 The use of a real-time multitasking kernel simplifies

More information

Blocking vs. Non-blocking Communication under. MPI on a Master-Worker Problem. Institut fur Physik. TU Chemnitz. D Chemnitz.

Blocking vs. Non-blocking Communication under. MPI on a Master-Worker Problem. Institut fur Physik. TU Chemnitz. D Chemnitz. Blocking vs. Non-blocking Communication under MPI on a Master-Worker Problem Andre Fachat, Karl Heinz Homann Institut fur Physik TU Chemnitz D-09107 Chemnitz Germany e-mail: fachat@physik.tu-chemnitz.de

More information

TR-CS The rsync algorithm. Andrew Tridgell and Paul Mackerras. June 1996

TR-CS The rsync algorithm. Andrew Tridgell and Paul Mackerras. June 1996 TR-CS-96-05 The rsync algorithm Andrew Tridgell and Paul Mackerras June 1996 Joint Computer Science Technical Report Series Department of Computer Science Faculty of Engineering and Information Technology

More information