Foster s Methodology: Application Examples

Similar documents
Parallel Computing. Parallel Algorithm Design

Non-Uniform Memory Access (NUMA) Architecture and Multicomputers

Non-Uniform Memory Access (NUMA) Architecture and Multicomputers

Parallel Algorithm Design. Parallel Algorithm Design p. 1

All-Pairs Shortest Paths - Floyd s Algorithm

Monte Carlo Methods; Combinatorial Search

Combinatorial Search; Monte Carlo Methods

Parallel Algorithm Design

Non-Uniform Memory Access (NUMA) Architecture and Multicomputers

CS 470 Spring Parallel Algorithm Development. (Foster's Methodology) Mike Lam, Professor

Monitors; Software Transactional Memory

Copyright 2010, Elsevier Inc. All rights Reserved

Monitors; Software Transactional Memory

Contents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11

Parallel Computing. Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides)

COMMUNICATION IN HYPERCUBES

Driven Cavity Example

Matrix-vector Multiplication

Adaptive-Mesh-Refinement Pattern

Dynamic load balancing in OSIRIS

Concurrent Programming with OpenMP

Parallel Programming in C with MPI and OpenMP

COSC 462. Parallel Algorithms. The Design Basics. Piotr Luszczek

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Chapter 8

Introduction to Multigrid and its Parallelization

COMP/CS 605: Introduction to Parallel Computing Topic: Parallel Computing Overview/Introduction

Week 3: MPI. Day 04 :: Domain decomposition, load balancing, hybrid particlemesh

OpenMP and MPI. Parallel and Distributed Computing. Department of Computer Science and Engineering (DEI) Instituto Superior Técnico.

Parallel Real-Time Systems

L21: Putting it together: Tree Search (Ch. 6)!

Introduction to High Performance Computing

L15: Putting it together: N-body (Ch. 6)!

Partial Differential Equations

Data Partitioning. Figure 1-31: Communication Topologies. Regular Partitions

High Performance Computing. University questions with solution

Acknowledgements. Prof. Dan Negrut Prof. Darryl Thelen Prof. Michael Zinn. SBEL Colleagues: Hammad Mazar, Toby Heyn, Manoj Kumar

Multigrid Pattern. I. Problem. II. Driving Forces. III. Solution

L20: Putting it together: Tree Search (Ch. 6)!

Parallel Programming. Matrix Decomposition Options (Matrix-Vector Product)

Lecture 4: Locality and parallelism in simulation I

Thermal Coupling Method Between SPH Particles and Solid Elements in LS-DYNA

OpenMP and MPI. Parallel and Distributed Computing. Department of Computer Science and Engineering (DEI) Instituto Superior Técnico.

Three dimensional modeling using ponderomotive guiding center solver

GPU-based Distributed Behavior Models with CUDA

Sorting (Chapter 9) Alexandre David B2-206

Java Monitors. Parallel and Distributed Computing. Department of Computer Science and Engineering (DEI) Instituto Superior Técnico.

Domain Decomposition: Computational Fluid Dynamics

Particle Tracing Module

Basic Communication Operations Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar

Parallel Programming with MPI and OpenMP

Linear systems of equations

Continuum-Microscopic Models

Sorting (Chapter 9) Alexandre David B2-206

Collision and Proximity Queries

A New Approach to Modeling Physical Systems: Discrete Event Simulations of Grid-based Models

Multi-Domain Pattern. I. Problem. II. Driving Forces. III. Solution

Fast Methods with Sieve

Graph Algorithms. Parallel and Distributed Computing. Department of Computer Science and Engineering (DEI) Instituto Superior Técnico.

Domain Decomposition: Computational Fluid Dynamics

Data Structures - Binary Trees and Operations on Binary Trees

Parallelizing Adaptive Triangular Grids with Refinement Trees and Space Filling Curves

Introduction to Parallel Computing

Domain Decomposition: Computational Fluid Dynamics

Modeling and simulation the incompressible flow through pipelines 3D solution for the Navier-Stokes equations

Lecture 3: Sorting 1

Example 13 - Shock Tube

MPI Casestudy: Parallel Image Processing

Principle Of Parallel Algorithm Design (cont.) Alexandre David B2-206

Australian Journal of Basic and Applied Sciences, 3(2): , 2009 ISSN

Parallelization of Scientific Applications (II)

Basic Communication Operations (Chapter 4)

The goal is the definition of points with numbers and primitives with equations or functions. The definition of points with numbers requires a

Coping with the Ice Accumulation Problems on Power Transmission Lines

Workloads Programmierung Paralleler und Verteilter Systeme (PPV)

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Chapter 8

Numerical Simulation of Flow around a Spur Dike with Free Surface Flow in Fixed Flat Bed. Mukesh Raj Kafle

CSG obj. oper3. obj1 obj2 obj3. obj5. obj4

Efficient Tridiagonal Solvers for ADI methods and Fluid Simulation

A Pattern-supported Parallelization Approach

Let and be a differentiable function. Let Then be the level surface given by

Multiprocessors and Thread Level Parallelism Chapter 4, Appendix H CS448. The Greed for Speed

Message Passing Interface - MPI

Matrix multiplication

Simulation in Computer Graphics. Particles. Matthias Teschner. Computer Science Department University of Freiburg

Overview. Processor organizations Types of parallel machines. Real machines

Lecture 5. Applications: N-body simulation, sorting, stencil methods

Advanced Parallel Programming

The Vibrating String

Message Passing Interface - MPI

Development of the Compliant Mooring Line Model for FLOW-3D

Maximum Differential Graph Coloring

Last Time. Intro to Parallel Algorithms. Parallel Search Parallel Sorting. Merge sort Sample sort

6.141: Robotics systems and science Lecture 10: Implementing Motion Planning

Parallel Programming Concepts. Parallel Algorithms. Peter Tröger

course outline basic principles of numerical analysis, intro FEM

White paper ETERNUS CS800 Data Deduplication Background

Parallel Implementation of 3D FMA using MPI

Decision trees. Decision trees are useful to a large degree because of their simplicity and interpretability

Data Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation

Homework # 2 Due: October 6. Programming Multiprocessors: Parallelism, Communication, and Synchronization

Transcription:

Foster s Methodology: Application Examples Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico October 19, 2011 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 1 / 25

Outline Foster s design methodology partitioning communication agglomeration mapping Application Examples Boundary value problem Finding the maximum n-body problem CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 2 / 25

Distributed Memory Systems Parallel programming of distributed-memory systems is significantly different from shared-memory systems essentially due to the very large overhead in terms of: communication task initialization / termination efficient parallelization requires that these effects be taken into account from the start! CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 3 / 25

Foster s Design Methodology Problem Partitioning Communication Primitive Tasks Agglomeration CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 4 / 25

Foster s Design Methodology Problem Partitioning Communication Primitive Tasks Agglomeration CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 4 / 25

Foster s Design Methodology Problem Partitioning Communication Primitive Tasks Agglomeration CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 4 / 25

Foster s Design Methodology Problem Partitioning Communication Primitive Tasks Agglomeration CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 4 / 25

Foster s Design Methodology Problem Partitioning Communication Primitive Tasks Agglomeration Mapping CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 4 / 25

Boundary Value Problem Problem Determine the evolution of the temperature of a rod in the following conditions: rod of length 1 both ends of the rod at 0 o C uniform material insulated except at the ends initial temperature of a point at distance x is 100 sin(π x) CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 5 / 25

Boundary Value Problem Problem Determine the evolution of the temperature of a rod in the following conditions: rod of length 1 both ends of the rod at 0 o C uniform material insulated except at the ends initial temperature of a point at distance x is 100 sin(π x) A partial differential equation (PDE) governs the temperature of the rod in time. Typically too complicated to solve analytically. CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 5 / 25

Boundary Value Problem Finite difference methods can be used to obtain an approximate solution to these complex problems discretize space and time: unit-length rod divided into n sections of length h time from 0 to P divided into m periods of length k We obtain a grid n m of temperatures T (x, t). CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 6 / 25

Boundary Value Problem Finite difference methods can be used to obtain an approximate solution to these complex problems discretize space and time: unit-length rod divided into n sections of length h time from 0 to P divided into m periods of length k We obtain a grid n m of temperatures T (x, t). Temperature over time is given by: T (x, t) = r T (x 1, t 1) + (1 2r) T (x, t 1) + r T (x + 1, t 1) where r = k/h 2. CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 6 / 25

Boundary Value Problem CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 7 / 25

Boundary Value Problem Partitioning: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 8 / 25

Boundary Value Problem Partitioning: Make each T (x, t) computation a primitive task. 2-dimensional domain decomposition t m n x CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 8 / 25

Boundary Value Problem Communication: t m n x CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 9 / 25

Boundary Value Problem Communication: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 9 / 25

Boundary Value Problem Agglomeration: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 10 / 25

Boundary Value Problem Agglomeration: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 10 / 25

Boundary Value Problem Agglomeration: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 11 / 25

Boundary Value Problem Agglomeration: Mapping: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 11 / 25

Boundary Value Problem Agglomeration: Mapping: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 11 / 25

Boundary Value Problem Analysis of execution time: C: time to compute T (x, t) = rt (x 1, t 1) + (1 2r)T (x, t 1) + rt (x + 1, t 1) D: time to send/receive one T (x, t) from another processor p: number of processors Sequential algorithm: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 12 / 25

Boundary Value Problem Analysis of execution time: C: time to compute T (x, t) = rt (x 1, t 1) + (1 2r)T (x, t 1) + rt (x + 1, t 1) D: time to send/receive one T (x, t) from another processor p: number of processors Sequential algorithm: mnc Parallel algorithm: computation per time instant: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 12 / 25

Boundary Value Problem Analysis of execution time: C: time to compute T (x, t) = rt (x 1, t 1) + (1 2r)T (x, t 1) + rt (x + 1, t 1) D: time to send/receive one T (x, t) from another processor p: number of processors Sequential algorithm: mnc Parallel algorithm: computation per time instant: communication per time instant: n p C CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 12 / 25

Boundary Value Problem Analysis of execution time: C: time to compute T (x, t) = rt (x 1, t 1) + (1 2r)T (x, t 1) + rt (x + 1, t 1) D: time to send/receive one T (x, t) from another processor p: number of processors Sequential algorithm: mnc Parallel algorithm: computation per time instant: n p C communication per time instant: 2D (receiving is synchronous, sending asynchronous) Total: m( n p C + 2D) CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 12 / 25

Finding the Maximum Problem Determine the maximum over a set of n values. This is a particular case of a reduction: a 0 a 1 a 2 a n 1 where can be any associative binary operator. a reduction always takes Θ(n) time on a sequential computer CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 13 / 25

Finding the Maximum Partitioning: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 14 / 25

Finding the Maximum Partitioning: Make each value a primitive task. 1-dimensional domain decomposition One task will compute final solution: root task CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 14 / 25

Finding the Maximum Communication: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 15 / 25

Finding the Maximum Communication: n/4-1 tasks n/4-1 tasks n-1 tasks n/2-1 tasks CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 15 / 25

Finding the Maximum Communication: n/4-1 tasks n/4-1 tasks n/2-1 tasks n/2-1 tasks CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 15 / 25

Finding the Maximum Communication: n/4-1 tasks n/4-1 tasks n/4-1 tasks n/4-1 tasks CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 15 / 25

Finding the Maximum Communication: n/4-1 tasks n/4-1 tasks n/4-1 tasks n/4-1 tasks Continue recursively: Binomial Tree CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 15 / 25

Binomial Trees Recursive definition of Binomial Tree, with n = 2 k nodes: Bk-1 Bk-1 B0 Bk CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 16 / 25

Binomial Trees Recursive definition of Binomial Tree, with n = 2 k nodes: Bk-1 Bk-1 B0 Bk B 0 B 1 B 2 B 3 B 4 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 16 / 25

Finding the Maximum Illustrative example: 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 17 / 25

Finding the Maximum Illustrative example: 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 17 / 25

Finding the Maximum Illustrative example: 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 8 1 7 0 8 5-4 -3 3 7 0 7 3-4 3 7 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 17 / 25

Finding the Maximum Illustrative example: 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 8 1 7 0 8 5-4 -3 3 7 0 7 3-4 3 7 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 17 / 25

Finding the Maximum Illustrative example: 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 8 1 7 0 8 7-4 -3 3 7 0 7 3-4 3 7 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 17 / 25

Finding the Maximum Illustrative example: 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 8 1 7 0 8 7-4 -3 3 7 0 7 3-4 3 7 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 17 / 25

Finding the Maximum Illustrative example: 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 8 7 7 0 8 7-4 -3 3 7 0 7 3-4 3 7 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 17 / 25

Finding the Maximum Illustrative example: 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 8 7 7 0 8 7-4 -3 3 7 0 7 3-4 3 7 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 17 / 25

Finding the Maximum Agglomeration: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 18 / 25

Finding the Maximum Agglomeration: Group n leafs of the tree: 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 18 / 25

Finding the Maximum Agglomeration: Group n leafs of the tree: 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 Mapping: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 18 / 25

Finding the Maximum Agglomeration: Group n leafs of the tree: 1 1 5-1 8 5-9 -3 2 7 0 7 3-4 3 7 Mapping: The same (actually, in the agglomeration phase, use n such that you end up with p tasks). CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 18 / 25

Finding the Maximum Analysis of execution time: C: time to perform the binary operation (maximum) D: time to send/receive one value from another processor p: number of processors Sequential algorithm: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 19 / 25

Finding the Maximum Analysis of execution time: C: time to perform the binary operation (maximum) D: time to send/receive one value from another processor p: number of processors Sequential algorithm: (n 1)C Parallel algorithm: computation in the leafs: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 19 / 25

Finding the Maximum Analysis of execution time: C: time to perform the binary operation (maximum) D: time to send/receive one value from another processor p: number of processors Sequential algorithm: (n 1)C Parallel algorithm: computation in the leafs: ( n p 1)C computation up the tree: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 19 / 25

Finding the Maximum Analysis of execution time: C: time to perform the binary operation (maximum) D: time to send/receive one value from another processor p: number of processors Sequential algorithm: (n 1)C Parallel algorithm: computation in the leafs: ( n p 1)C computation up the tree: log p (C + D) Total: ( n p 1)C + log p (C + D) CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 19 / 25

n-body Problem Problem Simulate the motion of n particles of varying masses in two dimensions. compute new positions and velocities consider gravitational interactions only Straightforward sequential algorithms solve this problem in Θ(n 2 ) time (however, better time complexity algorithms exist) CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 20 / 25

n-body Problem Partitioning: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 21 / 25

n-body Problem Partitioning: Make each particle a primitive task. 1-dimensional domain decomposition Communication: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 21 / 25

n-body Problem Partitioning: Make each particle a primitive task. 1-dimensional domain decomposition Communication: All tasks need to communicate with all tasks! Gather operation: one task receive a dataset from all tasks. All-gather operation: all tasks receive a dataset from all tasks. 0 1 0 1 2 3 2 3 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 21 / 25

n-body Problem Partitioning: Make each particle a primitive task. 1-dimensional domain decomposition Communication: All tasks need to communicate with all tasks! Gather operation: one task receive a dataset from all tasks. All-gather operation: all tasks receive a dataset from all tasks. 0 1 0 1 2 3 2 3 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 21 / 25

n-body Problem Partitioning: Make each particle a primitive task. 1-dimensional domain decomposition Communication: All tasks need to communicate with all tasks! Gather operation: one task receive a dataset from all tasks. All-gather operation: all tasks receive a dataset from all tasks. 0 1 0 1 2 3 2 3 CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 21 / 25

n-body Problem Partitioning: Make each particle a primitive task. 1-dimensional domain decomposition Communication: All tasks need to communicate with all tasks! Gather operation: one task receive a dataset from all tasks. All-gather operation: all tasks receive a dataset from all tasks. 0 1 0 1 2 3 2 3 Implement communication using a hypercube topology! CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 21 / 25

n-body Problem Agglomeration / Mapping: All primitive tasks have the same computation cost and communication pattern no particular strategy for agglomeration required Agglomerate n/p primitive tasks, so that we have one task per processor. CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 22 / 25

n-body Problem Analysis of execution time (per time instant): C: time to compute new particle position and velocity D: time to initiate message B: (bandwidth) number of data units that can be sent in one unit of time p: number of processors Sequential algorithm: Cn(n 1) Parallel algorithm: Computation time: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 23 / 25

n-body Problem Analysis of execution time (per time instant): C: time to compute new particle position and velocity D: time to initiate message B: (bandwidth) number of data units that can be sent in one unit of time p: number of processors Sequential algorithm: Cn(n 1) Parallel algorithm: Computation time: C n p (n 1) Communication time: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 23 / 25

n-body Problem Analysis of execution time (per time instant): C: time to compute new particle position and velocity D: time to initiate message B: (bandwidth) number of data units that can be sent in one unit of time p: number of processors Sequential algorithm: Cn(n 1) Parallel algorithm: Computation time: C n p (n 1) Communication time: one message with k data units: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 23 / 25

n-body Problem Analysis of execution time (per time instant): C: time to compute new particle position and velocity D: time to initiate message B: (bandwidth) number of data units that can be sent in one unit of time p: number of processors Sequential algorithm: Cn(n 1) Parallel algorithm: Computation time: C n p (n 1) Communication time: one message with k data units: D + k B depth of tree: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 23 / 25

n-body Problem Analysis of execution time (per time instant): C: time to compute new particle position and velocity D: time to initiate message B: (bandwidth) number of data units that can be sent in one unit of time p: number of processors Sequential algorithm: Cn(n 1) Parallel algorithm: Computation time: C n p (n 1) Communication time: one message with k data units: D + k B depth of tree: log p length of message at depth i: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 23 / 25

n-body Problem Analysis of execution time (per time instant): C: time to compute new particle position and velocity D: time to initiate message B: (bandwidth) number of data units that can be sent in one unit of time p: number of processors Sequential algorithm: Cn(n 1) Parallel algorithm: Computation time: C n p (n 1) Communication time: one message with k data units: D + k B depth of tree: log p length of message at depth i: 2 i 1 n p total communication time: CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 23 / 25

n-body Problem Analysis of execution time (per time instant): C: time to compute new particle position and velocity D: time to initiate message B: (bandwidth) number of data units that can be sent in one unit of time p: number of processors Sequential algorithm: Cn(n 1) Parallel algorithm: Computation time: C n p (n 1) Communication time: one message with k data units: D + k B depth of tree: log p length of message at depth i: 2 i 1 n p total communication time: ( ) log p i=1 D + 2i 1 n pb ( ) Total: D log p + n p 1 p B + C(n 1) = D log p + n(p 1) pb CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 23 / 25

Review Foster s design methodology partitioning communication agglomeration mapping Application Examples Boundary value problem Finding the maximum n-body problem CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 24 / 25

Next Class MPI CPD (DEI / IST) Parallel and Distributed Computing 11 2011-10-19 25 / 25