Parallel Monte Carlo Simulation of Colloidal Crystallization Jeff Boghosian Advisor Dr. Talid Sinno

Abstract

The Monte Carlo algorithm is frequently used in particle simulations, but as the number of particles grows, simulations may take days to run. Efforts have been made to parallelize Monte Carlo, but because the algorithm is inherently sequential, parallel Monte Carlo algorithms tend to be inefficient. Simple methods of parallelizing the algorithm could result in data being sent across the network at every step, killing performance. I created a modified parallel Monte Carlo simulation that significantly reduces network traffic and eliminates synchronization issues between processors.

I also developed an interactive 3-dimensional graphical user interface (GUI) to facilitate visualization of the system as it grows. With this interface, users are able to quickly see how the system is developing and understand the growth mechanisms. The GUI reads the text files that the simulation writes every fixed number of steps. The user has many visualization controls, including changing the color and size of particles and choosing different appearances for specific particle types and phases.

Related work

The de facto standard algorithm for Monte Carlo simulation is the Metropolis algorithm (Metropolis et al., 1953). At each step, a particle is given a random displacement. If the energy of the system decreases, the move is accepted. Otherwise, the move is accepted with probability exp(-ΔE/kT), where ΔE is the change in energy, k is Boltzmann's constant, and T is the temperature. While the algorithm has changed little since then, computing power has increased exponentially, allowing us to simulate larger and larger numbers of particles. The actual simulation in the Metropolis paper consisted of just 224 particles. By comparison, Molecular Dynamics simulations with over 12 billion atoms were recently reported running on a Cray supercomputer (Rapaport, 2006). That simulation used hundreds of processors and terabytes of memory.

The high cost of supercomputers has led to great interest in running parallel simulations on cheaper, commodity hardware. Significant improvements in running time have been achieved using particle subdivision and the Message Passing Interface in a Monte Carlo simulation (Carvalho et al., 2000). The group used a master-slave pattern to parallelize the code. At each step, the master process would choose a particle and ask every other processor for the energy between the chosen particle and the subset of particles in that processor's particle list. During trials, four processors were connected via a local network, and for 2048 particles, speedups near 2.5x were realized. The group also observed larger speedups for larger numbers of particles.

There exist a few general-purpose molecular dynamics programs. MOLDY is a molecular dynamics modeling program written in C. It is a popular program, but it uses molecular dynamics instead of Monte Carlo. Also, since there are so many possible combinations of particles, system types, and interactions, no single program can be expected to serve one's exact needs.

Commercial modeling software exists, but it is very general and not specialized for this type of modeling. Accelrys sells a product called Materials Studio that does graphical modeling and simulations. The cost of full-featured commercial modeling software is very high, though, and this product does not seem to support the specific features we need.

Finally, NAMD is an open source simulator and viewer for biomolecular systems. It has a pretty interface and nice graphics using OpenGL, but it is focused on biology and lacks the capability of viewing large particle systems.

Technical Approach

Monte Carlo is a randomized algorithm, so no two runs are the same. Each iteration, the algorithm applies a random move to each particle in sequence. After each individual move, the program calculates the change in energy for the system. Depending on the change in energy, the program either accepts the particle's new position or rejects the move and keeps the particle in its old position. This step is repeated for every particle in the system. An iteration through the entire list of particles is commonly known as a sweep. Simulations may last thousands or millions of sweeps.
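The sweep described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in this project: the pair potential is a Lennard-Jones stand-in in reduced units rather than the colloidal potential, no cutoff or neighbor list is used, and the names metropolis_accept, pair_energy, particle_energy, and sweep are hypothetical.

```python
import math
import random


def metropolis_accept(delta_e, kT, rng):
    """Metropolis criterion: always accept downhill moves, accept uphill
    moves with probability exp(-delta_e / kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)


def pair_energy(r2):
    """Stand-in pair potential (an assumption, not the colloidal potential):
    Lennard-Jones in reduced units, as a function of squared distance."""
    inv6 = 1.0 / r2 ** 3
    return 4.0 * (inv6 * inv6 - inv6)


def particle_energy(i, positions):
    """Energy between particle i and every other particle (no cutoff)."""
    xi, yi, zi = positions[i]
    total = 0.0
    for j, (xj, yj, zj) in enumerate(positions):
        if j != i:
            r2 = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            total += pair_energy(r2)
    return total


def sweep(positions, max_disp, kT, rng):
    """One sweep: attempt one random displacement per particle, in sequence.
    Returns the number of accepted moves; positions is updated in place."""
    accepted = 0
    for i in range(len(positions)):
        old = positions[i]
        e_old = particle_energy(i, positions)
        positions[i] = tuple(c + rng.uniform(-max_disp, max_disp) for c in old)
        delta_e = particle_energy(i, positions) - e_old
        if metropolis_accept(delta_e, kT, rng):
            accepted += 1
        else:
            positions[i] = old  # reject: restore the old position
    return accepted
```

Because each trial move sees the positions left behind by all earlier moves in the same sweep, the loop body cannot simply be split across processors. That dependency is the root of the difficulties discussed in the rest of this section.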

In the serial Monte Carlo simulation, nearest neighbor lists are kept in order to minimize unnecessary calculations (Verlet, 1967). When the program calculates energies for a given particle, it also keeps track of all of the particle's close neighbors and stores them in an array. This way, the next time the program calculates energies for that particle, it only has to check the particles in the array, since these are the only ones close enough to interact.

The serial Monte Carlo algorithm is fairly straightforward, but there were many choices I had to make in parallelizing the code. First of all, there are several general techniques for parallelizing particle simulations. One such method is spatial subdivision, in which each machine simulates the particles in a different chunk of space. Since the colloidal potential function is very small at relatively large distances, we define a cutoff distance R_c and take the potential between a particle and any other particle farther away than R_c to be zero. Using this approximation, we only have to calculate energies for a particle using its nearest neighbors (Prasad, 2005).

Fig. 1. Spatial subdivision of particles. Separate machines process regions A, B, and C in parallel.

Figure 1 shows spatial subdivision of a system. At any time, each particle is in region A, B, or C. Spatial subdivision assigns a machine to each region; in this case, there are three machines. Each machine simulates the motions of the particles in its region. When particles are near a border, the machines need to communicate through the network in order to ensure that all neighboring interactions are considered.
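As a rough illustration of the cutoff and of spatial subdivision, the sketch below truncates an arbitrary pair potential at R_c and assigns particles to regions. It assumes, purely for illustration, that the regions are equal-width slabs along x in a non-periodic box; the actual geometry of Figure 1 is not specified here, and the function names are hypothetical.

```python
def truncated_energy(r, r_cut, potential):
    """Return potential(r) inside the cutoff and exactly zero outside it."""
    return potential(r) if r < r_cut else 0.0


def region_of(x, box_length, n_regions):
    """Index of the slab (0 .. n_regions - 1) that contains coordinate x."""
    width = box_length / n_regions
    return min(int(x // width), n_regions - 1)


def near_border(x, box_length, n_regions, r_cut):
    """True if x lies within r_cut of an internal boundary between slabs,
    so the particle may interact with particles owned by another machine."""
    width = box_length / n_regions
    for b in range(1, n_regions):
        if abs(x - b * width) < r_cut:
            return True
    return False
```

Only particles for which near_border is true can require communication, so the fraction of such particles shrinks as the regions become wide relative to R_c.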

Another way to accomplish parallelism is through particle subdivision (Prasad, 2005). In this algorithm, each particle is assigned to a machine, and this machine controls the particle for the duration of the simulation. In Figure 1, the blue particles could be assigned to one machine and the green particles to another. The master process chooses a random particle to move and informs the other processes of the proposed move. Then, each slave process returns the energy between the moved particle and the subset of particles that process owns. The master then decides whether or not to accept the move and informs the others. The issue with this algorithm is that several messages need to be passed for each individual movement. If the cost of calculating energies is large compared to the network delay, this is not a problem. In our simulation, however, the interaction distance is very small and processors are fast, so the network latency would far outweigh the computation time.

I chose to use spatial subdivision for my simulation. The simulations I am working with contain thousands of particles, so I wanted to reduce network traffic as much as possible. The colloidal particles being simulated have relatively short interaction distances. Since the cell length is much greater than the interaction distance, it is very likely that all of a given particle's neighbors will be on the same processor. This way, the number of queries made to neighboring cells is minimized.

Parallel Monte Carlo algorithms are notoriously slow. The problem arises because every individual particle movement is done sequentially. In each step of the serial Monte Carlo algorithm, the position of a particle may change. This move may influence the energy of subsequent moves, so every other particle must be aware of the new position. In the serial algorithm this isn't an issue, as all particles are stored in local memory on the same machine. In a parallel algorithm, however, after a particle's position is updated, a message might need to be sent across the network to notify the other cells. Network traffic at every step can slow the algorithm to a crawl.

Synchronization is also an issue with parallel Monte Carlo. If processors A and B are simultaneously choosing a particle within their space and making random moves, issues can arise.

If a particle in cell A is moved, cell B might be calculating the energy of a move based on A's old state. Since messages take a relatively long time to send, this problem could arise very frequently. If it happens, the simulation no longer has any physical significance, since it did not run as intended. It's quite possible that two particles could end up overlapping, a state that energetically should never exist. Synchronizing individual particle movements would take a lot of network communication at each step, which is infeasible if we are trying to speed up the simulation.

For comparison, Molecular Dynamics (MD) is one type of particle simulation that is frequently parallelized. The nature of the algorithm only requires communication between processors after every complete sweep of the particles. In MD, the motion of the particles is not randomly assigned, but instead calculated from the forces applied by neighboring particles. The forces acting on each particle are all calculated at once, and then the incremental movements of the particles can be made independently of each other, depending on the previously calculated forces. Finally, cells exchange information about their particles' movements and repeat the process (Prasad, 2005). The difference between these algorithms is that in MD you only need to update particle positions every sweep, whereas MC movements are made individually, so positions may have to be updated every step.

I utilized the Message Passing Interface (MPI) for my simulation. MPI has become the industry standard for parallelization. While MPI itself is just a specification, there are many implementations of it, such as Open MPI, MPICH, and LAM/MPI. MPI comprises over a hundred functions, but only a very small subset is actually necessary for full parallelism. The most important functions are MPI_Send and MPI_Recv, which send messages to and receive messages from other processes. There are blocking and non-blocking versions of these, and they must be used carefully in order to avoid deadlock.

It is quite easy to inadvertently cause deadlock. Suppose you want every cell to send data to its right neighbor and receive from its left. Simply calling MPI_Send followed by MPI_Recv in each process can cause deadlock: if the sends are not buffered, each process blocks at MPI_Send until the sent message is received, but no process ever reaches MPI_Recv since they are all blocked. A smarter approach is to have even cells receive data first and odd cells send data first (Gropp, Lusk, and Skjellum, 1994).
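The deadlock argument can be checked with a small pure-Python model that makes no MPI calls. It assumes the worst case for MPI_Send, namely an unbuffered (rendezvous) send that completes only when the receiver is already waiting in the matching receive; the function names and the ring layout are hypothetical.

```python
def run_rendezvous(programs):
    """Simulate blocking, unbuffered (rendezvous) message passing.

    programs[r] is the list of operations for rank r, each either
    ("send", dest) or ("recv", src). A send completes only when the
    destination is currently waiting in the matching recv.
    Returns True if every rank finishes, False on deadlock.
    """
    pc = [0] * len(programs)  # index of each rank's current operation
    while True:
        progressed = False
        for r, prog in enumerate(programs):
            if pc[r] >= len(prog):
                continue
            op, peer = prog[pc[r]]
            if op != "send" or pc[peer] >= len(programs[peer]):
                continue
            if programs[peer][pc[peer]] == ("recv", r):
                pc[r] += 1      # the send completes...
                pc[peer] += 1   # ...together with the matching receive
                progressed = True
        if all(pc[r] >= len(p) for r, p in enumerate(programs)):
            return True
        if not progressed:
            return False


def naive_ring(n):
    """Every rank sends to its right neighbor first, then receives."""
    return [[("send", (r + 1) % n), ("recv", (r - 1) % n)] for r in range(n)]


def even_odd_ring(n):
    """Even ranks receive first; odd ranks send first."""
    progs = []
    for r in range(n):
        send, recv = ("send", (r + 1) % n), ("recv", (r - 1) % n)
        progs.append([recv, send] if r % 2 == 0 else [send, recv])
    return progs
```

In this model, run_rendezvous(naive_ring(4)) reports a deadlock, while run_rendezvous(even_odd_ring(4)) runs to completion. Real MPI implementations may buffer small messages, so the naive ordering can appear to work in testing and then hang once messages grow larger.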

The simulation space is a cube. For my simulation, I divided that cube into cells and then further divided the cells into subcells. To run the simulation, I planned on assigning one cell to each processor for maximum efficiency. Dividing the cells even further mainly had the benefit that I didn't have to keep neighbor lists for each cell: the neighbors can be found in the subcells directly around the current subcell. The side length of the subcells must be greater than the interaction cutoff, R_c. This confines the particles that can potentially interact with those in a particular subcell to that subcell and its 26 neighbors (a 3 by 3 by 3 cube). This is a substantial speedup, since the program won't needlessly check energies of particles that are guaranteed to be outside of R_c.

Fig. 2. The subdivision of the cells. Particles in the yellow subcells are controlled by that cell, while particles in the white subcells are copied from neighboring cells. For instance, subcells 10, 15, and 20 in Cell 1 are copies of subcells 7, 12, and 17 in Cell 2. Local copies of subcells are stored to reduce network traffic.

An extra outer layer of subcells was added to each cell in order to keep track of nearby particles owned by neighboring cells. In Figure 2, the white subcells comprise this extra layer. When calculating energies for particles in a border subcell such as number 14, we need to look at the particles stored in subcells 10, 15, and 20, among others. The particles in these subcells aren't actually owned by that cell, but having them stored locally saves expensive network traffic.
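A minimal sketch of the subcell bookkeeping follows. It assumes each cell stores its subcells on an integer grid with m owned subcells per side and one ghost layer, so owned subcells have indices 1 to m and the copied (white) subcells have index 0 or m + 1. The three-component indexing differs from the flat numbering in Figure 2, and all names are hypothetical.

```python
from itertools import product


def subcell_index(pos, origin, subcell_len):
    """Integer (ix, iy, iz) index of the subcell containing pos.
    origin is the corner of the cell's ghost layer, so owned subcells
    have indices 1..m and ghost subcells have index 0 or m + 1."""
    return tuple(int((p - o) // subcell_len) for p, o in zip(pos, origin))


def neighbourhood(index):
    """The subcell itself plus its 26 neighbours (a 3 x 3 x 3 block)."""
    ix, iy, iz = index
    return [(ix + dx, iy + dy, iz + dz)
            for dx, dy, dz in product((-1, 0, 1), repeat=3)]


def is_ghost(index, m):
    """True if the subcell belongs to the extra outer layer of copies."""
    return any(i == 0 or i == m + 1 for i in index)
```

With the ghost layer in place, neighbourhood never produces an index outside the local grid for an owned subcell, so the energy loop needs no special case at cell boundaries.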

When an MPI program is initialized, it assigns each process a unique rank. Each process runs exactly the same code, so the rank is necessary to differentiate between the cells. By convention, the master process (if there is one) is the one with rank 0. In my parallel simulation, the master cell creates the particles, either reading them from an input file or assigning them random positions. It then distributes the particles to the appropriate slave cell, or to itself, since the master cell acts like any other cell after distributing the particles. Once the particles are distributed, the simulation can begin.

I actually had a significant portion of my code implemented before I realized why Monte Carlo simulations generally aren't parallelized. The synchronization issue was difficult to circumvent. If I moved a particle near the border of a cell, it might interact with particles in a neighboring cell, but that cell might already be in the middle of calculating energies. I thought about ways to send messages back and forth to synchronize this, but it seemed infeasible. There could be any number of particles near the border, so any number of messages would have to be synchronized.

Next, I wondered if I could predetermine the order in which particles were processed. The simulation would proceed as normal in each cell, but if a particle was near the border, the cell would only continue processing if it had precedence, or else wait until it received an updated particle position from the neighboring cell and it was this particle's turn. I also discarded this as too slow and too complicated.

My idea was to run the simulation one subcell at a time. Every cell would start off simultaneously processing its version of the same subcell. After finishing that subcell, each cell would update its neighbors with the new positions, then move on to the next subcell, and repeat this process until every subcell was processed. The order in which the subcells are processed is chosen randomly each iteration in order to prevent any regional bias. Since any given subcell is in the same relative position within each cell, none of the subcells being run in parallel border each other. This eliminated any border synchronization issues.

It also significantly reduced the amount of data sent across the network. Data only needs to be sent after a surface subcell is processed; interior subcells don't have any effect on particles in neighboring cells. The number of surface subcells grows as n^(2/3), where n is the number of subcells in a cell. Now the program only had to transmit data after entire subcells were processed, as opposed to after every particle.
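The schedule can be sketched as follows. How the cells agree on the random order is not described above, so this sketch assumes every process seeds an identical generator from a shared seed and the sweep number; that mechanism, the 0 to m - 1 indexing of owned subcells, and the function names are all assumptions made for illustration.

```python
import random
from itertools import product


def subcell_order(m, sweep_number, seed=12345):
    """Random processing order for the m**3 owned subcells of a cell.
    Every process seeds its generator identically, so all cells visit the
    same relative subcell at the same time without exchanging messages."""
    order = list(product(range(m), repeat=3))
    random.Random(seed * 1_000_003 + sweep_number).shuffle(order)
    return order


def is_surface(index, m):
    """True if the subcell touches a face of its cell, so its particles
    may interact with particles owned by a neighbouring cell."""
    return any(i == 0 or i == m - 1 for i in index)


def count_surface(m):
    """Number of surface subcells: all subcells minus the interior block."""
    return m ** 3 - max(m - 2, 0) ** 3
```

For m subcells per side the count is m^3 - (m - 2)^3, which grows like 6 m^2 and is therefore proportional to n^(2/3) for n = m^3 subcells. With m = 10, for example, 488 of the 1000 subcells trigger an exchange; the saving over per-particle messaging comes from sending one update per subcell rather than one per move.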

Now that the particles were distributed and the simulation could proceed, the last thing to take care of was particles wandering into another cell. The particle lists need to be updated before it becomes possible for two particles from non-neighboring subcells to have moved far enough to interact. The calculation of the update interval is straightforward. Initially, two particles with a subcell between them have a separation of at least the side length of the subcell, d_s. The particles will start interacting when they are within the interaction cutoff, R_c. The other constant is rd_max, the maximum displacement a particle can have each step. In n steps, the two particles can approach each other by at most 2 * n * rd_max, so it is necessary that 2 * n * rd_max + R_c < d_s, that is, n < (d_s - R_c) / (2 * rd_max). Depending on the constants used, the particle lists only need to be updated every n steps. Once the lists are updated, there is no chance for two particles in non-neighboring subcells to interact for at least another n steps.

Graphical User Interface

The GUI was developed as a completely separate component from the simulation. Modularizing the code eliminated dependencies and gave me the flexibility to write the GUI in any language I liked. The only link between the two parts is the set of files that the simulation outputs, which are read by the GUI and turned into a 3-D representation. One of the important features of this program is the ability to recognize that a simulation has gone awry while it is still running. Since the simulation can run for days at a time, it is important to know how the system is progressing during the simulation.

The Java Swing toolkit was used for the GUI. The Java platform was chosen for portability across multiple platforms and because creating interfaces is very easy with just a little knowledge of the Swing libraries. Inserting an OpenGL canvas into the interface was very simple using the third-party JOGL libraries. Rendering thousands of particles per frame can be very costly, so I decided to render the solid particles as low-resolution spheres and the liquid particles as GL dots. Dots are rendered much faster than spheres, which are constructed of multiple polygons. Zoom, rotation, and translation controls were implemented so the user can have virtually any view of the system, and a slider bar was added to step through frames and watch the growth of the system.

Fig. 3. An image of a colloidal system from the simulation. The spheres are solid-phase particles and the small dots are liquid-phase particles.

The GUI has already turned out to be very important, as it allowed me to see what was going on in the system during my simulation's test runs. Before I ironed out the bugs, I would see very odd things: sometimes the crystal would disband completely within a few steps, or a whole section of the crystal would disappear right around the cell boundaries. This gave me an idea of how useful the GUI could be in conjunction with the particle simulations.

Conclusion

Overall I'm satisfied with my project. The parallel Monte Carlo algorithm became the main thrust of my project even though it was the last thing I added. Originally I was planning on just doing the GUI and some energy calculations, but when I realized that wouldn't be enough I added the parallel simulation.

Writing a parallel simulation was a learning experience for me, and one that I'm glad to have had. I'd never written parallel code before except for multi-threaded programs. I had to learn a lot, starting right at the basics with MPI (which I didn't know existed) and continuing through parallel coding techniques and optimization. I had to learn the inner workings of the serial Monte Carlo simulation, with countless arrays, potential matrices, and neighbor lists, and then figure out a way to parallelize it. The challenge of parallelizing the code was not obvious at first, as I figured it couldn't be much different from parallelizing MD. A lot of thought went into it before I found a way to accomplish it.

I'm disappointed I didn't have the time and resources to test the speedup of my parallel simulation on multiple machines. During development, I had an Open MPI implementation installed on my computer, and was testing the parallelism by having it spawn 8 or 27 processes locally. This worked, and was actually pretty fast considering that all communication was still done using sockets and all 8 processes were running on the same CPU: 1000 sweeps of 5000 particles each completed in 6-7 minutes.

By the time I was completely finished and had the bugs ironed out, I only had a week or so left. I wasn't able to secure enough workstations in the SEAS clusters in that time, so I instead tried to run the simulation on a few of my friends' MacBooks. I had all sorts of issues with running MPI on these computers. Open MPI got the closest: the master process started running, and I could see on the other machines that it had spawned processes through SSH. But after that, nothing happened. The main function of the program was never reached. MPICH never ran in parallel; no matter what I did, only one process ran at a time. The LAM/MPI daemons didn't help either. My problems may have been because Mac is not the usual platform for MPI, but the OS is Unix-based and I was able to compile every implementation without problems, so I don't see the issue there.

The GUI turned out not to be a huge endeavor, and I didn't really expect it to be, but it is a really cool piece of software that will save a lot of time and effort when it is used by Dr. Sinno's group. It was a lot of fun to develop and play around with. I've had a good deal of experience doing interfaces for websites and applications, and I was able to apply this knowledge to make an easy-to-use piece of software that is portable and pretty fun.

References

Carvalho, A., Gomes, J., and Cordeiro, M. (2000). Parallel Implementation of a Monte Carlo Molecular Simulation Program. J. Chem. Inf. Comput. Sci., 40(3).

A parallel Monte Carlo implementation was created using particle subdivision by a doctoral student. This research claimed speedups of 2 using four processors. It was encouraging to see other attempts at parallel Monte Carlo simulation, but this work took place in 1999 with 266 MHz machines connected across 100Mb ethernet. Since then, processing power has increased exponentially, but 100Mb ethernet is generally still the standard. Now more than ever, network latency is the limiting factor in parallel computing, and the speedups seen in this research could be hard to reproduce due to network limitations.

Gropp, W., Lusk, E., and Skjellum, A. (1994). Using MPI: Portable Parallel Programming with the Message-Passing Interface. Cambridge, MA: The MIT Press.

The authors of this reference manual were all part of the team that created the Message Passing Interface. At the time of printing, Gropp and Lusk were both Computer Scientists at Argonne National Laboratory, and Skjellum was an Assistant Professor of Computer Science at Mississippi State University. This guide on using MPI was written in 1994, just a couple of years after MPI was standardized. Since the core interface has remained stable, the text is just as relevant today as it was when it was written. Since then, MPI-2 has been specified, adding lots of new functionality. However, it is backwards compatible with MPI-1, and the new version has not seen widespread adoption.

Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., and Teller, E. (1953). Equation of State Calculations by Fast Computing Machines. J. Chem. Phys., 21(6).

This paper introduced the Metropolis Monte Carlo algorithm, which has become one of the most popular algorithms used today. The paper was published over fifty years ago, yet the algorithm is still widely used. Nicholas Metropolis was a physicist who worked at Los Alamos National Laboratory during the Manhattan Project, and the coauthors were also esteemed physicists. This paper was very interesting to read, since computing methods were not nearly as advanced as they are today: the largest simulation consisted of only 224 particles. Nonetheless, this is one of the seminal papers in the field of particle simulations.

Prasad, M. (2005). Multiscale modeling and simulation of aggregation in crystalline semiconductor materials. Unpublished doctoral dissertation, University of Pennsylvania.

This dissertation from a Ph.D. student in Dr. Sinno's group was not focused on colloidal particles, but it utilized a parallel Molecular Dynamics simulation. I applied some of the ideas found in this dissertation, especially the transfer of bordering particles between neighboring cells. It also includes good descriptions of the different decomposition methods for parallelization, including atom, force, and spatial subdivision.

Rapaport, D. (2006). Multibillion-atom molecular dynamics simulation: Design considerations for vector-parallel processing. Computer Physics Communications, 174(7).

The author is a professor of physics who has seemingly made a career out of molecular dynamics, writing dozens of papers on the subject. He seems to be an authority on cutting-edge simulations, so I didn't doubt his research. I included this to give an idea of the future of particle simulation. This paper was written in 2006, so it is one of the more recent attempts to increase the number of particles in simulations. Using 2016 processors on a Cray X1 supercomputer, over 12 billion atoms were simulated, requiring almost 2 TB of memory. Parallelism is clearly the future of particle simulation, as processor speeds aren't increasing by much any more and massive memory requirements demand it.

Verlet, L. (1967). Computer Experiments on Classical Fluids. I. Thermodynamical Properties of Lennard-Jones Molecules. Physical Review, 159(1).

This is the paper in which Loup Verlet introduced his algorithm for keeping nearest neighbor lists. It is clearly a very respected and important paper, since the lists are now named after him. In the paper, the technique is only briefly mentioned, but it has become a staple in Monte Carlo simulations. When calculating the energy of a particle, you only need to look at the particles in its neighbor list instead of looking at every other particle. This insight significantly reduced the running time of the simulations from n^2 to n*p, where n is the number of particles and p is the average number of particles in a nearest neighbor list. For my simulation, I didn't actually use nearest neighbor lists, since I already had a similar infrastructure in place. The subcells I created have the same effect as neighbor lists. They may contain more particles than necessary for any particular particle, but time is saved by not constructing the lists at all.


More information

Adaptive Assignment for Real-Time Raytracing

Adaptive Assignment for Real-Time Raytracing Adaptive Assignment for Real-Time Raytracing Paul Aluri [paluri] and Jacob Slone [jslone] Carnegie Mellon University 15-418/618 Spring 2015 Summary We implemented a CUDA raytracer accelerated by a non-recursive

More information

Most real programs operate somewhere between task and data parallelism. Our solution also lies in this set.

Most real programs operate somewhere between task and data parallelism. Our solution also lies in this set. for Windows Azure and HPC Cluster 1. Introduction In parallel computing systems computations are executed simultaneously, wholly or in part. This approach is based on the partitioning of a big task into

More information

An introduction to the VDI landscape

An introduction to the VDI landscape The : An Virtual desktop infrastructures are quickly gaining popularity in the IT industry as end users are now able to connect to their desktops from any location, at any time. This e-guide, from SearchVirtualDesktop.com,

More information

MITOCW MIT6_172_F10_lec18_300k-mp4

MITOCW MIT6_172_F10_lec18_300k-mp4 MITOCW MIT6_172_F10_lec18_300k-mp4 The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for

More information

CUDA GPGPU Workshop 2012

CUDA GPGPU Workshop 2012 CUDA GPGPU Workshop 2012 Parallel Programming: C thread, Open MP, and Open MPI Presenter: Nasrin Sultana Wichita State University 07/10/2012 Parallel Programming: Open MP, MPI, Open MPI & CUDA Outline

More information

Parallelism and Concurrency. COS 326 David Walker Princeton University

Parallelism and Concurrency. COS 326 David Walker Princeton University Parallelism and Concurrency COS 326 David Walker Princeton University Parallelism What is it? Today's technology trends. How can we take advantage of it? Why is it so much harder to program? Some preliminary

More information

GENERAL-PURPOSE COMPUTATION USING GRAPHICAL PROCESSING UNITS

GENERAL-PURPOSE COMPUTATION USING GRAPHICAL PROCESSING UNITS GENERAL-PURPOSE COMPUTATION USING GRAPHICAL PROCESSING UNITS Adrian Salazar, Texas A&M-University-Corpus Christi Faculty Advisor: Dr. Ahmed Mahdy, Texas A&M-University-Corpus Christi ABSTRACT Graphical

More information

Manually Sync Itouch Touch Itunes Wont Let Me Update My Music To My

Manually Sync Itouch Touch Itunes Wont Let Me Update My Music To My Manually Sync Itouch Touch Itunes Wont Let Me Update My Music To My i was lost my music library when my ipod was connected to wifi. can anyone tell me the shuffle option doesn't work with the 8.4 software

More information

Computer Science 210 Data Structures Siena College Fall Topic Notes: Complexity and Asymptotic Analysis

Computer Science 210 Data Structures Siena College Fall Topic Notes: Complexity and Asymptotic Analysis Computer Science 210 Data Structures Siena College Fall 2017 Topic Notes: Complexity and Asymptotic Analysis Consider the abstract data type, the Vector or ArrayList. This structure affords us the opportunity

More information

CSE494 Information Retrieval Project C Report

CSE494 Information Retrieval Project C Report CSE494 Information Retrieval Project C Report By: Jianchun Fan Introduction In project C we implement several different clustering methods on the query results given by pagerank algorithms. The clustering

More information

Seminar on. A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm

Seminar on. A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm Seminar on A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm Mohammad Iftakher Uddin & Mohammad Mahfuzur Rahman Matrikel Nr: 9003357 Matrikel Nr : 9003358 Masters of

More information

JULIA ENABLED COMPUTATION OF MOLECULAR LIBRARY COMPLEXITY IN DNA SEQUENCING

JULIA ENABLED COMPUTATION OF MOLECULAR LIBRARY COMPLEXITY IN DNA SEQUENCING JULIA ENABLED COMPUTATION OF MOLECULAR LIBRARY COMPLEXITY IN DNA SEQUENCING Larson Hogstrom, Mukarram Tahir, Andres Hasfura Massachusetts Institute of Technology, Cambridge, Massachusetts, USA 18.337/6.338

More information

Domain Decomposition for Colloid Clusters. Pedro Fernando Gómez Fernández

Domain Decomposition for Colloid Clusters. Pedro Fernando Gómez Fernández Domain Decomposition for Colloid Clusters Pedro Fernando Gómez Fernández MSc in High Performance Computing The University of Edinburgh Year of Presentation: 2004 Authorship declaration I, Pedro Fernando

More information

Formal Methods of Software Design, Eric Hehner, segment 1 page 1 out of 5

Formal Methods of Software Design, Eric Hehner, segment 1 page 1 out of 5 Formal Methods of Software Design, Eric Hehner, segment 1 page 1 out of 5 [talking head] Formal Methods of Software Engineering means the use of mathematics as an aid to writing programs. Before we can

More information

Available Optimization Methods

Available Optimization Methods URL: http://cxc.harvard.edu/sherpa3.4/methods/methods.html Last modified: 11 January 2007 Return to: Optimization Methods Index Available Optimization Methods The primary task of Sherpa is to fit a model

More information

Data parallel algorithms 1

Data parallel algorithms 1 Data parallel algorithms (Guy Steele): The data-parallel programming style is an approach to organizing programs suitable for execution on massively parallel computers. In this lecture, we will characterize

More information

Educational Fusion. Implementing a Production Quality User Interface With JFC

Educational Fusion. Implementing a Production Quality User Interface With JFC Educational Fusion Implementing a Production Quality User Interface With JFC Kevin Kennedy Prof. Seth Teller 6.199 May 1999 Abstract Educational Fusion is a online algorithmic teaching program implemented

More information

6.001 Notes: Section 4.1

6.001 Notes: Section 4.1 6.001 Notes: Section 4.1 Slide 4.1.1 In this lecture, we are going to take a careful look at the kinds of procedures we can build. We will first go back to look very carefully at the substitution model,

More information

MITOCW watch?v=rvrkt-jxvko

MITOCW watch?v=rvrkt-jxvko MITOCW watch?v=rvrkt-jxvko The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To

More information

2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or

2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The

More information

Geant4 v9.5. Kernel III. Makoto Asai (SLAC) Geant4 Tutorial Course

Geant4 v9.5. Kernel III. Makoto Asai (SLAC) Geant4 Tutorial Course Geant4 v9.5 Kernel III Makoto Asai (SLAC) Geant4 Tutorial Course Contents Fast simulation (Shower parameterization) Multi-threading Computing performance Kernel III - M.Asai (SLAC) 2 Fast simulation (shower

More information

Praktikum 2014 Parallele Programmierung Universität Hamburg Dept. Informatics / Scientific Computing. October 23, FluidSim.

Praktikum 2014 Parallele Programmierung Universität Hamburg Dept. Informatics / Scientific Computing. October 23, FluidSim. Praktikum 2014 Parallele Programmierung Universität Hamburg Dept. Informatics / Scientific Computing October 23, 2014 Paul Bienkowski Author 2bienkow@informatik.uni-hamburg.de Dr. Julian Kunkel Supervisor

More information

JSish. Ryan Grasell. June For my senior project, I implemented Professor Keen s JSish spec in C++. JSish

JSish. Ryan Grasell. June For my senior project, I implemented Professor Keen s JSish spec in C++. JSish JSish Ryan Grasell June 2015 1 Introduction For my senior project, I implemented Professor Keen s JSish spec in C++. JSish is a subset of Javascript with support for execution from the command line and

More information

MITOCW watch?v=penh4mv5gag

MITOCW watch?v=penh4mv5gag MITOCW watch?v=penh4mv5gag PROFESSOR: Graph coloring is the abstract version of a problem that arises from a bunch of conflict scheduling situations. So let's look at an example first and then define the

More information

Data Partitioning. Figure 1-31: Communication Topologies. Regular Partitions

Data Partitioning. Figure 1-31: Communication Topologies. Regular Partitions Data In single-program multiple-data (SPMD) parallel programs, global data is partitioned, with a portion of the data assigned to each processing node. Issues relevant to choosing a partitioning strategy

More information

High Performance Computer Architecture Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

High Performance Computer Architecture Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur High Performance Computer Architecture Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture - 23 Hierarchical Memory Organization (Contd.) Hello

More information

Efficient String Concatenation in Python

Efficient String Concatenation in Python Efficient String Concatenation in Python An assessment of the performance of several methods Source : http://www.skymind.com/~ocrow/python_string/ Introduction Building long strings in the Python progamming

More information

Programming with MPI on GridRS. Dr. Márcio Castro e Dr. Pedro Velho

Programming with MPI on GridRS. Dr. Márcio Castro e Dr. Pedro Velho Programming with MPI on GridRS Dr. Márcio Castro e Dr. Pedro Velho Science Research Challenges Some applications require tremendous computing power - Stress the limits of computing power and storage -

More information

Finding a needle in Haystack: Facebook's photo storage

Finding a needle in Haystack: Facebook's photo storage Finding a needle in Haystack: Facebook's photo storage The paper is written at facebook and describes a object storage system called Haystack. Since facebook processes a lot of photos (20 petabytes total,

More information

MITOCW watch?v=flgjisf3l78

MITOCW watch?v=flgjisf3l78 MITOCW watch?v=flgjisf3l78 The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To

More information

Information Coding / Computer Graphics, ISY, LiTH

Information Coding / Computer Graphics, ISY, LiTH Sorting on GPUs Revisiting some algorithms from lecture 6: Some not-so-good sorting approaches Bitonic sort QuickSort Concurrent kernels and recursion Adapt to parallel algorithms Many sorting algorithms

More information

Pervasive PSQL Summit v10 Highlights Performance and analytics

Pervasive PSQL Summit v10 Highlights Performance and analytics Pervasive PSQL Summit v10 Highlights Performance and analytics A Monash Information Services Bulletin by Curt A. Monash, PhD. September, 2007 Sponsored by: Pervasive PSQL Version 10 Highlights Page 2 PSQL

More information

Contents Slide Set 9. Final Notes on Textbook Chapter 7. Outline of Slide Set 9. More about skipped sections in Chapter 7. Outline of Slide Set 9

Contents Slide Set 9. Final Notes on Textbook Chapter 7. Outline of Slide Set 9. More about skipped sections in Chapter 7. Outline of Slide Set 9 slide 2/41 Contents Slide Set 9 for ENCM 369 Winter 2014 Lecture Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary Winter Term, 2014

More information

David DeFlyer Class notes CS162 January 26 th, 2009

David DeFlyer Class notes CS162 January 26 th, 2009 1. Class opening: 1. Handed out ACM membership information 2. Review of last lecture: 1. operating systems were something of an ad hoc component 2. in the 1960s IBM tried to produce a OS for all customers

More information

Midterm Exam Amy Murphy 19 March 2003

Midterm Exam Amy Murphy 19 March 2003 University of Rochester Midterm Exam Amy Murphy 19 March 2003 Computer Systems (CSC2/456) Read before beginning: Please write clearly. Illegible answers cannot be graded. Be sure to identify all of your

More information

IP subnetting made easy

IP subnetting made easy Version 1.0 June 28, 2006 By George Ou Introduction IP subnetting is a fundamental subject that's critical for any IP network engineer to understand, yet students have traditionally had a difficult time

More information

Slide Set 8. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng

Slide Set 8. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng Slide Set 8 for ENCM 369 Winter 2018 Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary March 2018 ENCM 369 Winter 2018 Section 01

More information

Week 12: Running Time and Performance

Week 12: Running Time and Performance Week 12: Running Time and Performance 1 Most of the problems you have written in this class run in a few seconds or less Some kinds of programs can take much longer: Chess algorithms (Deep Blue) Routing

More information

DESIGN AND ANALYSIS OF ALGORITHMS. Unit 1 Chapter 4 ITERATIVE ALGORITHM DESIGN ISSUES

DESIGN AND ANALYSIS OF ALGORITHMS. Unit 1 Chapter 4 ITERATIVE ALGORITHM DESIGN ISSUES DESIGN AND ANALYSIS OF ALGORITHMS Unit 1 Chapter 4 ITERATIVE ALGORITHM DESIGN ISSUES http://milanvachhani.blogspot.in USE OF LOOPS As we break down algorithm into sub-algorithms, sooner or later we shall

More information

Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi.

Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi. Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi Lecture 18 Tries Today we are going to be talking about another data

More information

Meshes: Catmull-Clark Subdivision and Simplification

Meshes: Catmull-Clark Subdivision and Simplification Meshes: Catmull-Clark Subdivision and Simplification Part 1: What I did CS 838, Project 1 Eric Aderhold My main goal with this project was to learn about and better understand three-dimensional mesh surfaces.

More information

Scalability of Processing on GPUs

Scalability of Processing on GPUs Scalability of Processing on GPUs Keith Kelley, CS6260 Final Project Report April 7, 2009 Research description: I wanted to figure out how useful General Purpose GPU computing (GPGPU) is for speeding up

More information

Molecular Dynamics Simulations with Julia

Molecular Dynamics Simulations with Julia Emily Crabb 6.338/18.337 Final Project Molecular Dynamics Simulations with Julia I. Project Overview This project consists of one serial and several parallel versions of a molecular dynamics simulation

More information

Performance Metrics of a Parallel Three Dimensional Two-Phase DSMC Method for Particle-Laden Flows

Performance Metrics of a Parallel Three Dimensional Two-Phase DSMC Method for Particle-Laden Flows Performance Metrics of a Parallel Three Dimensional Two-Phase DSMC Method for Particle-Laden Flows Benzi John* and M. Damodaran** Division of Thermal and Fluids Engineering, School of Mechanical and Aerospace

More information

Adventures in Load Balancing at Scale: Successes, Fizzles, and Next Steps

Adventures in Load Balancing at Scale: Successes, Fizzles, and Next Steps Adventures in Load Balancing at Scale: Successes, Fizzles, and Next Steps Rusty Lusk Mathematics and Computer Science Division Argonne National Laboratory Outline Introduction Two abstract programming

More information

Programming with MPI. Pedro Velho

Programming with MPI. Pedro Velho Programming with MPI Pedro Velho Science Research Challenges Some applications require tremendous computing power - Stress the limits of computing power and storage - Who might be interested in those applications?

More information

Who am I? I m a python developer who has been working on OpenStack since I currently work for Aptira, who do OpenStack, SDN, and orchestration

Who am I? I m a python developer who has been working on OpenStack since I currently work for Aptira, who do OpenStack, SDN, and orchestration Who am I? I m a python developer who has been working on OpenStack since 2011. I currently work for Aptira, who do OpenStack, SDN, and orchestration consulting. I m here today to help you learn from my

More information

Optimizing Molecular Dynamics

Optimizing Molecular Dynamics Optimizing Molecular Dynamics This chapter discusses performance tuning of parallel and distributed molecular dynamics (MD) simulations, which involves both: (1) intranode optimization within each node

More information

OBJECT ORIENTED SOFTWARE DEVELOPMENT USING JAVA (2ND EDITION) BY XIAOPING JIA

OBJECT ORIENTED SOFTWARE DEVELOPMENT USING JAVA (2ND EDITION) BY XIAOPING JIA Read Online and Download Ebook OBJECT ORIENTED SOFTWARE DEVELOPMENT USING JAVA (2ND EDITION) BY XIAOPING JIA DOWNLOAD EBOOK : OBJECT ORIENTED SOFTWARE DEVELOPMENT USING Click link bellow and free register

More information

Localized and Incremental Monitoring of Reverse Nearest Neighbor Queries in Wireless Sensor Networks 1

Localized and Incremental Monitoring of Reverse Nearest Neighbor Queries in Wireless Sensor Networks 1 Localized and Incremental Monitoring of Reverse Nearest Neighbor Queries in Wireless Sensor Networks 1 HAI THANH MAI AND MYOUNG HO KIM Department of Computer Science Korea Advanced Institute of Science

More information

Computer Graphics Prof. Sukhendu Das Dept. of Computer Science and Engineering Indian Institute of Technology, Madras Lecture - 24 Solid Modelling

Computer Graphics Prof. Sukhendu Das Dept. of Computer Science and Engineering Indian Institute of Technology, Madras Lecture - 24 Solid Modelling Computer Graphics Prof. Sukhendu Das Dept. of Computer Science and Engineering Indian Institute of Technology, Madras Lecture - 24 Solid Modelling Welcome to the lectures on computer graphics. We have

More information

Introduction to Parallel Computing. CPS 5401 Fall 2014 Shirley Moore, Instructor October 13, 2014

Introduction to Parallel Computing. CPS 5401 Fall 2014 Shirley Moore, Instructor October 13, 2014 Introduction to Parallel Computing CPS 5401 Fall 2014 Shirley Moore, Instructor October 13, 2014 1 Definition of Parallel Computing Simultaneous use of multiple compute resources to solve a computational

More information

Hi everyone. Starting this week I'm going to make a couple tweaks to how section is run. The first thing is that I'm going to go over all the slides

Hi everyone. Starting this week I'm going to make a couple tweaks to how section is run. The first thing is that I'm going to go over all the slides Hi everyone. Starting this week I'm going to make a couple tweaks to how section is run. The first thing is that I'm going to go over all the slides for both problems first, and let you guys code them

More information

The Plan: Basic statistics: Random and pseudorandom numbers and their generation: Chapter 16.

The Plan: Basic statistics: Random and pseudorandom numbers and their generation: Chapter 16. Scientific Computing with Case Studies SIAM Press, 29 http://www.cs.umd.edu/users/oleary/sccswebpage Lecture Notes for Unit IV Monte Carlo Computations Dianne P. O Leary c 28 What is a Monte-Carlo method?

More information

Parallel & Cluster Computing. cs 6260 professor: elise de doncker by: lina hussein

Parallel & Cluster Computing. cs 6260 professor: elise de doncker by: lina hussein Parallel & Cluster Computing cs 6260 professor: elise de doncker by: lina hussein 1 Topics Covered : Introduction What is cluster computing? Classification of Cluster Computing Technologies: Beowulf cluster

More information

To: 10/18/80 Multics Technical Bulletin MTB-455. W. Olin Sibert MTB Distribution October 18, From: Date:

To: 10/18/80 Multics Technical Bulletin MTB-455. W. Olin Sibert MTB Distribution October 18, From: Date: 10/18/80 Multics Technical Bulletin MTB-455 From: To: Date: Subject: (or) W. Olin Sibert MTB Distribution October 18, 1980 Desupporting the Bulk Store, Whither Page Multi-Level? This MTB discusses the

More information

INTRODUCTION. Chapter GENERAL

INTRODUCTION. Chapter GENERAL Chapter 1 INTRODUCTION 1.1 GENERAL The World Wide Web (WWW) [1] is a system of interlinked hypertext documents accessed via the Internet. It is an interactive world of shared information through which

More information

MapReduce and Friends

MapReduce and Friends MapReduce and Friends Craig C. Douglas University of Wyoming with thanks to Mookwon Seo Why was it invented? MapReduce is a mergesort for large distributed memory computers. It was the basis for a web

More information

Challenges in large-scale graph processing on HPC platforms and the Graph500 benchmark. by Nkemdirim Dockery

Challenges in large-scale graph processing on HPC platforms and the Graph500 benchmark. by Nkemdirim Dockery Challenges in large-scale graph processing on HPC platforms and the Graph500 benchmark by Nkemdirim Dockery High Performance Computing Workloads Core-memory sized Floating point intensive Well-structured

More information

Case study on PhoneGap / Apache Cordova

Case study on PhoneGap / Apache Cordova Chapter 1 Case study on PhoneGap / Apache Cordova 1.1 Introduction to PhoneGap / Apache Cordova PhoneGap is a free and open source framework that allows you to create mobile applications in a cross platform

More information

F003 Monte-Carlo Statics on Large 3D Wide-azimuth Data

F003 Monte-Carlo Statics on Large 3D Wide-azimuth Data F003 Monte-Carlo Statics on Large 3D Wide-azimuth Data D. Le Meur* (CGGVeritas) SUMMARY Estimation of surface-consistent residual statics on large 3D wide-azimuth data using a Monte-Carlo approach is a

More information

Software Engineering at VMware Dan Scales May 2008

Software Engineering at VMware Dan Scales May 2008 Software Engineering at VMware Dan Scales May 2008 Eng_BC_Mod 1.Product Overview v091806 The Challenge Suppose that you have a very popular software platform: that includes hardware-level and OS code that

More information

Virtual Memory. Chapter 8

Virtual Memory. Chapter 8 Chapter 8 Virtual Memory What are common with paging and segmentation are that all memory addresses within a process are logical ones that can be dynamically translated into physical addresses at run time.

More information

Distributed Virtual Reality Computation

Distributed Virtual Reality Computation Jeff Russell 4/15/05 Distributed Virtual Reality Computation Introduction Virtual Reality is generally understood today to mean the combination of digitally generated graphics, sound, and input. The goal

More information

3/24/2014 BIT 325 PARALLEL PROCESSING ASSESSMENT. Lecture Notes:

3/24/2014 BIT 325 PARALLEL PROCESSING ASSESSMENT. Lecture Notes: BIT 325 PARALLEL PROCESSING ASSESSMENT CA 40% TESTS 30% PRESENTATIONS 10% EXAM 60% CLASS TIME TABLE SYLLUBUS & RECOMMENDED BOOKS Parallel processing Overview Clarification of parallel machines Some General

More information

MPI History. MPI versions MPI-2 MPICH2

MPI History. MPI versions MPI-2 MPICH2 MPI versions MPI History Standardization started (1992) MPI-1 completed (1.0) (May 1994) Clarifications (1.1) (June 1995) MPI-2 (started: 1995, finished: 1997) MPI-2 book 1999 MPICH 1.2.4 partial implemention

More information

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE RAID SEMINAR REPORT 2004 Submitted on: Submitted by: 24/09/2004 Asha.P.M NO: 612 S7 ECE CONTENTS 1. Introduction 1 2. The array and RAID controller concept 2 2.1. Mirroring 3 2.2. Parity 5 2.3. Error correcting

More information

GPU Optimized Monte Carlo

GPU Optimized Monte Carlo GPU Optimized Monte Carlo Jason Mick, Eyad Hailat, Kamel Rushaidat, Yuanzhe Li, Loren Schwiebert, and Jeffrey J. Potoff Department of Chemical Engineering & Materials Science, and Department of Computer

More information

10 Strategies for Effective Marketing Campaigns

10 Strategies for Effective  Marketing Campaigns 10 Strategies for Effective Email Marketing Campaigns Most people do not send effective email messages. I know. I spend a lot of time analyzing email messages for our clients, and measuring and tracking

More information

Partitioning Effects on MPI LS-DYNA Performance

Partitioning Effects on MPI LS-DYNA Performance Partitioning Effects on MPI LS-DYNA Performance Jeffrey G. Zais IBM 138 Third Street Hudson, WI 5416-1225 zais@us.ibm.com Abbreviations: MPI message-passing interface RISC - reduced instruction set computing

More information

Next-Generation Parallel Query

Next-Generation Parallel Query Next-Generation Parallel Query Robert Haas & Rafia Sabih 2013 EDB All rights reserved. 1 Overview v10 Improvements TPC-H Results TPC-H Analysis Thoughts for the Future 2017 EDB All rights reserved. 2 Parallel

More information

MD-HQ Utilizes Atlantic.Net s Private Cloud Solutions to Realize Tremendous Growth

MD-HQ Utilizes Atlantic.Net s Private Cloud Solutions to Realize Tremendous Growth Success Story: MD-HQ Utilizes Atlantic.Net s Private Cloud Solutions to Realize Tremendous Growth Atlantic.Net specializes in providing security and compliance hosting solutions, most specifically in the

More information

Parallel Processing Top manufacturer of multiprocessing video & imaging solutions.

Parallel Processing Top manufacturer of multiprocessing video & imaging solutions. 1 of 10 3/3/2005 10:51 AM Linux Magazine March 2004 C++ Parallel Increase application performance without changing your source code. Parallel Processing Top manufacturer of multiprocessing video & imaging

More information

Ghost Cell Pattern. Fredrik Berg Kjolstad. January 26, 2010

Ghost Cell Pattern. Fredrik Berg Kjolstad. January 26, 2010 Ghost Cell Pattern Fredrik Berg Kjolstad University of Illinois Urbana-Champaign, USA kjolsta1@illinois.edu Marc Snir University of Illinois Urbana-Champaign, USA snir@illinois.edu January 26, 2010 Problem

More information

Study Guide Processes & Job Control

Study Guide Processes & Job Control Study Guide Processes & Job Control Q1 - PID What does PID stand for? Q2 - Shell PID What shell command would I issue to display the PID of the shell I'm using? Q3 - Process vs. executable file Explain,

More information

Performance Comparison between Blocking and Non-Blocking Communications for a Three-Dimensional Poisson Problem

Performance Comparison between Blocking and Non-Blocking Communications for a Three-Dimensional Poisson Problem Performance Comparison between Blocking and Non-Blocking Communications for a Three-Dimensional Poisson Problem Guan Wang and Matthias K. Gobbert Department of Mathematics and Statistics, University of

More information

LARGE-SCALE MOLECULAR-DYNAMICS SIMULATION OF 19 BILLION PARTICLES

LARGE-SCALE MOLECULAR-DYNAMICS SIMULATION OF 19 BILLION PARTICLES International Journal of Modern Physics C Vol. 15, No. 1 (2004) 193 201 c World Scientific Publishing Company LARGE-SCALE MOLECULAR-DYNAMICS SIMULATION OF 19 BILLION PARTICLES KAI KADAU Theoretical Division,

More information

Shadows for Many Lights sounds like it might mean something, but In fact it can mean very different things, that require very different solutions.

Shadows for Many Lights sounds like it might mean something, but In fact it can mean very different things, that require very different solutions. 1 2 Shadows for Many Lights sounds like it might mean something, but In fact it can mean very different things, that require very different solutions. 3 We aim for something like the numbers of lights

More information

Performance potential for simulating spin models on GPU

Performance potential for simulating spin models on GPU Performance potential for simulating spin models on GPU Martin Weigel Institut für Physik, Johannes-Gutenberg-Universität Mainz, Germany 11th International NTZ-Workshop on New Developments in Computational

More information