Salsa: Scalable Asynchronous Replica Exchange for Parallel Molecular Dynamics Applications

1 Salsa: Scalable Asynchronous Replica Exchange for Parallel Molecular Dynamics Applications L. Zhang, M. Parashar, E. Gallicchio, R. Levy TASSL & BIOMAPS Rutgers University ICPP 06, Columbus, OH, Aug. 16, 2006

2 Outline
- Introduction
- Problem Statement/Parallel Replica Exchange
- Salsa: Scalable Asynchronous Replica Exchange
- Experimental Evaluation
- Related Work
- Conclusion
- Future Work

3 Motivation
- Sequencing of the human genome and advances in structural genomics are resulting in an explosion in available high-resolution protein structures
- Large-scale (parallel) molecular simulations of protein structural changes and drug binding to proteins can have significant impact
- Simulations depend on efficient algorithms for searching over the rough energy landscapes that govern protein folding and drug binding

4 The Replica Exchange Algorithm
- A powerful sampling algorithm that preserves canonical distributions
- Allows for efficient crossing of high energy barriers that separate thermodynamically stable states
- Can significantly reduce sampling times as compared to formulations based on constant temperatures
- Algorithm overview
  - Several copies, or replicas, of the system are simulated in parallel at different temperatures using walkers
  - Walkers occasionally swap temperatures to allow them to bypass enthalpic barriers by moving to a high temperature
  - Exchanges are governed by a probability condition to ensure detailed balance
- Replica exchange can significantly impact the fields of structural biology and drug design, for example
  - Structure-based drug design and binding affinity optimization
  - Molecular basis of human diseases associated with protein misfolding
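The slide does not spell out the probability condition. In the standard temperature replica exchange formulation (an assumption here, not taken from the slides), a swap between walkers i and j, with potential energies E_i and E_j at temperatures T_i and T_j, is accepted with the Metropolis probability:

```latex
P_{\mathrm{acc}}(i \leftrightarrow j)
  = \min\left\{ 1,\ \exp\left[ (\beta_i - \beta_j)(E_i - E_j) \right] \right\},
\qquad \beta_k = \frac{1}{k_B T_k}
```

Under this rule, a swap that moves the lower-energy configuration to the lower temperature is always accepted; any other swap is accepted with a probability that decays exponentially in the product of the inverse-temperature gap and the energy gap.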

5 Parallel Replica Exchange
- General formulation requires dynamic and complex coordination and communication patterns between the walkers
  - Pair-wise and asynchronous
  - Dependent on the current state (temperature, energy) of replicas, which are changing dynamically
- Implementation based on commonly used parallel programming frameworks is challenging
  - Message passing frameworks, e.g. MPI, require matching sends and receives to be explicitly defined for each interaction
- Implementations use restrictive formulations using synchronous exchanges
  - Exchanges between neighboring temperatures only

6 Salsa: A Framework for Scalable Asynchronous Replica Exchange
- Provides an abstraction of a semantically specialized virtual shared space
- Scalable communication and interaction substrate based on the tuple-space model
- Supports code coupling, parallel data redistribution, multiblock coupling, asynchronous and decoupled interactions

7 Salsa Overview
- Architecture
  - Directory Layer
    - Presents the shared temperature space abstraction
    - Provides a rendezvous substrate for exchanges
  - Communication Layer
    - Supports high-throughput, low-latency p2p communications
- Implementation
  - C library using multithreading
  - Salsa daemon on each processor
  - Customized communication layer using sockets
- Complements other programming systems
  - MPI, PVM, OpenMP

8 Salsa Programming Interface
Operation: Description
- init(gbl-temp-range): Initialize the Seine-Salsa shared space.
- post(exch-temp-lower-bound, exch-temp-upper-bound): Post a temperature range of exchange interest to the shared space.
- get(?temp, engy): Get the exchanged temperature from the space. This is a blocking call and the calling process blocks until a matching temperature is available. The retrieved temperature is removed from the space.
- getp(?temp, engy): Get the exchanged temperature from the space. This is a non-blocking call and the calling process continues if no matching temperature is available. The retrieved temperature is removed from the space.

9 Salsa-based Replica Exchange
- Integrated within the IMPACT molecular mechanics program
  - Binding of ligands to the cytochrome P450 class of enzymes responsible for cellular detoxification and drug metabolism
  - Misfolding of naturally occurring and mutated forms of the protein synuclein associated with Parkinson's disease
- Scalably supports general (non-nearest-neighbor) temperature exchanges, ensuring proper mixing of temperatures across the walkers
- Pseudo-code
  - Post temperature range of interest
  - Negotiate exchange
  - Perform exchange

      ! First call: initialize the shared space; later calls: advance the step counter
      if (seineinitflag .eq. 0) then
         call init_salsa(global_temperature_lowbound, global_temperature_upperbound)
         seineinitflag = 1
         timestamp = 0
      else
         timestamp = timestamp + 1
      endif
      ! Integer division: true only when timestamp is a multiple of exchange_rate
      if (timestamp .eq. (timestamp/exchange_rate)*exchange_rate) then
         call post(tempt(nspec+1) - GUESSRANGE, tempt(nspec+1) + GUESSRANGE)
      endif
      ! Non-blocking: the simulation continues if no exchange is available
      call getp(newtemp, epot, accepted)
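The posting schedule in the pseudo-code can be sketched in Python as follows. This is an illustration only: `posted_range` is a hypothetical helper, not part of Salsa, and the numeric values in the usage line are made up. The integer-division test in the pseudo-code is equivalent to a modulo test for non-negative step counters.

```python
from typing import Optional, Tuple


def posted_range(timestamp: int, exchange_rate: int,
                 temperature: float, guess_range: float) -> Optional[Tuple[float, float]]:
    """Return the (lower, upper) range a walker would post at this step.

    Returns None on steps that are not a multiple of exchange_rate,
    mirroring the test timestamp == (timestamp / exchange_rate) * exchange_rate.
    """
    if exchange_rate <= 0:
        raise ValueError("exchange_rate must be positive")
    if timestamp % exchange_rate != 0:
        return None  # no post on this step
    # Post a window centred on the walker's current temperature.
    return (temperature - guess_range, temperature + guess_range)


# Hypothetical values: post every 25 steps, window of +/- 30 K around 400 K.
steps_with_posts = [t for t in range(100) if posted_range(t, 25, 400.0, 30.0)]
print(steps_with_posts)
```

A wider window gives more candidate partners per post but also more traffic and more exchanges that may later be rejected, which is the trade-off the deck evaluates under "Effect of Posted Temperature Range".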

10 Salsa Operation
- Walkers post temperature range of interest to the Salsa shared space using post
  - A request is routed to all Salsa daemons whose index ranges overlap with the posted range
  - On receiving a remote post request, the daemon first checks its storage for potential exchange partners
  - If a candidate exists (say walker2), the requesting walker (say walker1) is notified
  - Otherwise, the incoming request is stored
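The routing and store-or-match behaviour described above can be sketched as a single-process Python model. All names here (`Daemon`, `route_post`, `overlaps`) are hypothetical and do not come from the Salsa library. The matching rule is also an assumption: a stored request counts as a candidate when its posted range overlaps the incoming one, which the slides do not state explicitly.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Range = Tuple[float, float]


def overlaps(a: Range, b: Range) -> bool:
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


@dataclass
class Daemon:
    name: str
    index_range: Range  # slice of the temperature space this daemon indexes
    stored: List[Tuple[str, Range]] = field(default_factory=list)

    def handle_post(self, walker: str, posted: Range) -> List[str]:
        """Return candidate partners, or store the request if there are none."""
        candidates = [w for w, r in self.stored
                      if w != walker and overlaps(r, posted)]
        if not candidates:
            self.stored.append((walker, posted))
        return candidates


def route_post(daemons: List[Daemon], walker: str, posted: Range) -> Dict[str, List[str]]:
    """Deliver a post to every daemon whose index range overlaps the posted range."""
    return {d.name: d.handle_post(walker, posted)
            for d in daemons if overlaps(d.index_range, posted)}


daemons = [Daemon("d0", (300.0, 400.0)),
           Daemon("d1", (400.0, 500.0)),
           Daemon("d2", (500.0, 600.0))]
first = route_post(daemons, "walker2", (380.0, 420.0))   # stored on d0 and d1
second = route_post(daemons, "walker1", (410.0, 450.0))  # matches walker2 on d1
print(first, second)
```

In the real system the daemons run on separate processors and the post travels over the communication layer; the sketch only shows which daemons see a request and when a request is stored rather than matched.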

11 Salsa Operation
- Walkers negotiate exchange
  - A walker selects an exchange partner from one or more potential candidates
  - Partners must mutually agree to exchange data using a two-way handshake protocol
  - A walker can be in one of three states
    - free -- the walker is available for an exchange only if it is in the free state.
    - onhold -- the walker has already agreed to exchange with another walker but the exchange has not yet occurred.
    - finished -- the walker has already finished an exchange with another walker and its posted interest to exchange is no longer valid.

12 Salsa Operation
- Walkers negotiate exchange: handshake (contd.)
  - Walker1 contacts walker2 with a desire to exchange
  - Walker2 checks its local state (free, onhold, or finished)
  - If walker2 is free, it responds positively to walker1
  - The two walkers confirm their intent to exchange data with each other and change their state to onhold
  - If walker2 responds negatively, walker1 attempts to negotiate with the next walker in its list of candidates
  - If the walker cannot find an exchange partner in its list of candidates, it gives up and continues the simulation with its current data until the next exchange cycle.
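The three walker states and the negotiation loop can be sketched as a small state machine. This is a single-process illustration with hypothetical names (`State`, `Walker`, `negotiate`); in Salsa the request and reply are messages between processes, which the sketch replaces with direct method calls.

```python
from enum import Enum
from typing import List, Optional


class State(Enum):
    FREE = "free"
    ONHOLD = "onhold"
    FINISHED = "finished"


class Walker:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = State.FREE
        self.partner: Optional["Walker"] = None

    def respond(self, requester: "Walker") -> bool:
        """Answer an exchange request: positive only while this walker is free."""
        return self.state is State.FREE


def negotiate(walker: Walker, candidates: List[Walker]) -> Optional[Walker]:
    """Try candidates in order; return the agreed partner or None."""
    if walker.state is not State.FREE:
        return None
    for candidate in candidates:
        if candidate is walker:
            continue
        if candidate.respond(walker):
            # Both sides confirm and move to onhold until the exchange happens.
            walker.state = candidate.state = State.ONHOLD
            walker.partner, candidate.partner = candidate, walker
            return candidate
    # No partner found: keep simulating with current data until the next cycle.
    return None


w1, w2, w3, w4, w5 = (Walker(f"walker{i}") for i in range(1, 6))
w2.state = State.ONHOLD      # already committed to another exchange
w3.state = State.FINISHED    # its posted interest is no longer valid
partner = negotiate(w1, [w2, w3, w4])
nobody = negotiate(w5, [w2, w3])
print(partner.name if partner else None, nobody)
```

After the exchange itself completes, both walkers would move to the finished state; that transition is left out here because it belongs to the next step of the protocol.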

13 Salsa Operation
- Perform exchange using the getp operator
  - Walker1 sends its current data (e.g. temperature and energy) to its potential partner (i.e. walker2)
  - Walker2 determines whether they can exchange based on the data it receives and its own data
    - This step is necessary since exchanges occur asynchronously and in parallel with the computation, and a walker's data (i.e., energy) may have changed since it posted its exchange interest
  - If walker2 decides to continue with the exchange, it will notify walker1 and send its current local data to complete the exchange
  - Exchanges are between a pair of walkers, and multiple exchanges between different pairs of walkers can proceed in parallel.
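A minimal sketch of the decision walker2 makes with fresh data is given below. The slides do not state the test walker2 applies; the code assumes the standard Metropolis criterion for temperature replica exchange, with energies in kcal/mol. The function names are hypothetical and are not Salsa calls.

```python
import math
import random
from typing import Tuple

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption


def acceptance_probability(temp_a: float, energy_a: float,
                           temp_b: float, energy_b: float) -> float:
    """Metropolis probability of swapping the temperatures of walkers a and b."""
    beta_a = 1.0 / (K_B * temp_a)
    beta_b = 1.0 / (K_B * temp_b)
    exponent = (beta_a - beta_b) * (energy_a - energy_b)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


def perform_exchange(walker1_data: Tuple[float, float],
                     walker2_data: Tuple[float, float],
                     rng: random.Random) -> Tuple[bool, float, float]:
    """Decide on walker2's side using the CURRENT (temperature, energy) of both walkers.

    Returns (accepted, new temperature of walker1, new temperature of walker2).
    """
    (t1, e1), (t2, e2) = walker1_data, walker2_data
    accepted = rng.random() < acceptance_probability(t1, e1, t2, e2)
    if accepted:
        return True, t2, t1   # temperatures are swapped
    return False, t1, t2      # both walkers keep their temperatures


# Hypothetical data: the colder walker currently holds the higher-energy configuration.
result = perform_exchange((300.0, -90.0), (400.0, -100.0), random.Random(0))
print(result)
```

Evaluating the criterion at exchange time rather than at post time is what keeps the decision consistent with the walkers' actual states, since their energies keep changing while the negotiation is in flight.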

14 Example: Salsa Operation

15 Experimental Evaluation
- Platform: Linux cluster (Intel Pentium 1.7 GHz, 512 MB RAM, Linux, 100 Mbps interconnect)
- Simulations: Alanine tripeptide molecule using Hybrid Monte Carlo (HMC)
  - Temperatures exponentially distributed within the range K
  - 10 ns simulation time: 250,000 HMC cycles, each consisting of 10 steps of 4 fs
  - Exchanges attempted every 25 steps
- Experiments: Salsa vs. MPI-based replica exchange
  - Number of cross-walks, simulation time
  - Effect of exchange temperature range
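An exponentially spaced temperature ladder can be generated as in the sketch below. The temperature bounds in the usage lines are placeholders, because the actual range did not survive in the slide text. The second helper checks the simulation length under the reading that each HMC cycle consists of 10 steps of 4 fs, which is an interpretation of the slide rather than a stated fact.

```python
from typing import List


def temperature_ladder(t_min: float, t_max: float, n: int) -> List[float]:
    """Return n temperatures from t_min to t_max with a constant ratio between neighbours."""
    if n < 2:
        raise ValueError("need at least two temperatures")
    if not 0.0 < t_min < t_max:
        raise ValueError("require 0 < t_min < t_max")
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio ** k for k in range(n)]


def total_time_ns(cycles: int, steps_per_cycle: int, step_fs: float) -> float:
    """Total simulated time in nanoseconds (1 ns = 1e6 fs)."""
    return cycles * steps_per_cycle * step_fs / 1.0e6


# Placeholder bounds, chosen only so the ratio is easy to read.
ladder = temperature_ladder(200.0, 800.0, 3)
length = total_time_ns(250_000, 10, 4.0)
print(ladder, length)
```

With the figures on the slide, 250,000 cycles of 10 steps at 4 fs give 10 ns, which matches the stated simulation time.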

16 Experimental Evaluation: Number of Cross Walks A cross-walk is the event that a walker originally within the low temperature range reaches the upper temperature range (e.g. 650 K K) and then returns to the lower temperature range The rate of temperature equilibration is measured by the number of cross-walks (at equilibrium each walker visits each temperature with equal probability)
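Counting cross-walks from one walker's temperature history can be sketched as follows. The thresholds that define the low and high temperature ranges are hypothetical parameters here, since the exact bounds are not legible in the slide text.

```python
from typing import Iterable


def count_cross_walks(temperatures: Iterable[float],
                      low_max: float, high_min: float) -> int:
    """Count low -> high -> low round trips in a walker's temperature history.

    low_max:  upper edge of the low temperature range (assumed value)
    high_min: lower edge of the upper temperature range (assumed value)
    """
    count = 0
    state = "start"  # becomes "seen_low", then "seen_high", then back to "seen_low"
    for t in temperatures:
        if t <= low_max:
            if state == "seen_high":
                count += 1          # returned to the low range: one cross-walk done
            state = "seen_low"
        elif t >= high_min and state == "seen_low":
            state = "seen_high"     # reached the upper range after starting low
    return count


history = [300, 400, 700, 500, 310, 680, 690, 305, 450]
walks = count_cross_walks(history, low_max=350, high_min=650)
print(walks)
```

A trajectory that starts in the upper range does not count until the walker has first visited the low range, in line with the definition's "originally within the low temperature range".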

17 Experimental Evaluation: Simulation Time (a) Average wall-clock execution time and standard deviation with increasing number of processes (walkers); (b) Normalized execution time with increasing number of processes (walkers).

18 Experimental Evaluation: Effect of Posted Temperature Range The temperature range posted by a walker must be chosen to optimize simulation time, number of crosswalks, and convergence

19 Related Work
- Basic message-passing (MPI, PVM) based implementations use a simplified formulation of the algorithm
  - Exchanges occur only between replicas with adjacent temperatures, which limits effectiveness
  - Exchanges occur in a centralized and totally synchronous manner, which limits scalability
- Folding@HOME (Stanford U.) used a multiplexed replica exchange algorithm
  - Uses multiplexed replicas with a number of independent molecular dynamics runs at each temperature
  - Attempts exchanges of configurations between these multiplexed replicas
  - Efficiency improves as a larger number of potential exchange partners are available
- Salsa, to the best of our knowledge, is the first to address the decentralized and asynchronous parallel implementation of replica exchange
  - Improves simulation efficiency and scalability by eliminating the limitation of nearest-neighbor exchanges and enabling parallel, decoupled, and decentralized exchanges
  - Can support large numbers of replicas and a heterogeneous and loosely coupled pool of processors

20 Conclusion
- Salsa provides a semantically specialized virtual shared space abstraction to support scalable asynchronous replica exchange for molecular dynamics applications
  - Enables general non-nearest-neighbor temperature exchanges
  - Exchanges are decoupled, asynchronous, and dynamically determined
  - Communications are decentralized and peer-to-peer, and occur in parallel
- Salsa is implemented as part of the IMPACT molecular mechanics package
- Effectiveness, performance, and scalability of Salsa are experimentally demonstrated

21 Future Work
- The overall goal of the project is to enable large-scale Grid-based parallel and distributed molecular simulations of protein structural changes and drug binding to proteins.
- Specific tasks include
  - Implementing a prototype interaction and coordination framework, based on Salsa, for wide-area distributed replica exchange simulations
  - Developing, deploying, and evaluating the Grid-based IMPACT implementation
  - Using the Grid-based IMPACT implementation to provide scientific insights

Salsa: Scalable Asynchronous Replica Exchange for Parallel Molecular Dynamics Applications Li Zhang and Manish Parashar TASSL, CAIP Center Electrical and Computer Engineering Department Rutgers University