Message-Passing Thought Exercise
|
|
- Margery Andrea Jenkins
- 6 years ago
- Views:
Transcription
1 Message-Passing Thought Exercise Traffic Modelling David Henty EPCC
2 traffic flow 15 May 2015 Traffic Modelling 2 we want to predict traffic flow to look for effects such as congestion build a computer model
3 simple traffic model divide road into a series of cells either occupied or unoccupied perform a number of steps each step, cars move forward if space ahead is empty could do this by moving pawns on a chess board 15 May 2015 Traffic Modelling 3
4 traffic behaviour 15 May 2015 Traffic Modelling 4 model predicts a number of interesting features traffic lights congestion average speed 1.0 more complicated models are used in practice % 50% 100% density of cars
5 how fast can we run the model? measure speed in Car Operations Per second how many COPs? around 2 COPs but what about three people? can they do six COPs? 15 May 2015 Traffic Modelling 5
6 a parallel traffic model 15 May 2015 Traffic Modelling 6 A S B C
7 State Table If R t (i) = 0, then R t+1 (i) is given by: R t (i-1) = 0 R t (i -1) = 1 R t (i+1) = R t (i+1) = If R t (i) = 1, then R t+1 (i) is given by: R t (i-1) = 0 R t (i -1) = 1 R t (i+1) = R t (i+1) = HPC Concepts
8 Pseudo Code (serial) 8 HPC Concepts declare arrays old(i) and new(i), i = 0,1,...,N,N+1 initialise old(i) for i = 1,2,...,N-1,N (eg randomly) loop over iterations set old(0) = old(n) and set old(n+1) = old(1) loop over i = 1,...,N if old(i) = 1 if old(i+1) = 1 then new(i) = 1 else new(i) = 0 if old(i) = 0 if old(i-1) = 1 then new(i) = 1 else new(i) = 0 end loop over i set old(i) = new(i) for i = 1,2,...,N-1,N end loop over iterations
9 Pseudo Code (serial with subroutines) 9 HPC Concepts declare arrays old(i) and new(i), i = 0,1,...,N,N+1 initialise old(i) for i = 1,2,...,N-1,N (eg randomly) loop over iterations! Implement boundary conditions set old(0) = old(n) and set old(n+1) = old(1)! Update road call newroad(new, old, N)! Prepare for next iteration set old(i) = new(i) for i = 1,2,...,N-1,N end loop over iterations
10 Pseudo Code (distributed memory) 10 HPC Concepts! assume we are running on P processes declare arrays old(i) and new(i), i = 0,1,...,N/P,N/P+1 initialise old(i) for i = 1,2,...,N/P-1,N/P (eg randomly) loop over iterations! Implement boundary conditions (processes arranged as a ring) set old(0) on this process to old(n/p) from previous process set old(n/p+1) on this process to old(1) from next process! Update road call newroad(new, old, N/P)! Prepare for next iteration set old(i) = new(i) for i = 1,2,...,N/P-1,N/P end loop over iterations
11 Halo swapping 15 May 2015 Traffic Modelling 11! Implement boundary conditions set old(0) on this process to old(n/p) from previous process set old(n/p+1) on this process to old(1) from next process Implement this using blocking receives (e.g. MPI_Recv) and: or or synchronous send (routine blocks until message is received) e.g. MPI_Ssend asynchronous send (message copied into buffer, returns straight away) e.g. MPI_Bsend non-blocking synchronous send (no buffering but immediate return) e.g. MPI_Issend / MPI_Wait
12 Synchronous sends 15 May 2015 Traffic Modelling 12! Implement boundary conditions Ssend(old(N/P), up) Recv (old(1), down) Ssend(old(1), down) Recv (old(n/p+1), up) Guaranteed to deadlock
13 Asynchronous (buffered) sends! Implement boundary conditions Bsend(old(N/P), Recv (old(1), Bsend(old(1), up) down) down) Recv (old(n/p+1), up) Where do synchronisation issues become important? call newroad(new, old, N/P)? OK because we are writing new but only reading old set old(i) = new(i)? only OK because Bsend has copied old(1) and old(n/p) We don t really care if/when the message is received we do really care if/when we can safely reuse the local send buffers 15 May 2015 Traffic Modelling 13
14 Non-blocking (immediate) sends 15 May 2015 Traffic Modelling 14! Implement boundary conditions Issend(old(N/P), up) Recv (old(1), down) Issend(old(1), down) Recv (old(n/p+1), up) call newroad(new, old, N/P) set old(i) = new(i) for i = 1,2,...,N/P-1,N/P)
15 Non-blocking (immediate) sends 15 May 2015 Traffic Modelling 15! Implement boundary conditions Issend(old(N/P), up) Recv (old(1), down) Issend(old(1), down) Recv (old(n/p+1), up) call newroad(new, old, N/P) set old(i) = new(i) for i = 1,2,...,N/P-1,N/P)! Wait for communications to complete before next iteration wait(up) wait(down)
16 15 May 2015 Traffic Modelling 16 Non-blocking (immediate) sends! Implement boundary conditions Issend(old(N/P), Recv (old(1), Issend(old(1), up) down) down) Recv (old(n/p+1), up) call newroad(new, old, N/P) set old(i) = new(i) for i = 1,2,...,N/P-1,N/P)! Wait for communications to complete before next iteration wait(up) wait(down) Incorrect! overwriting old is the key issue need to know boundary values of old are sent before overwriting
17 Non-blocking sends: correct 15 May 2015 Traffic Modelling 17! Implement boundary conditions Issend(old(N/P), up) Recv (old(1), down) Issend(old(1), down) Recv (old(n/p+1), up) call newroad(new, old, N/P) wait(up) wait(down) set old(i) = new(i) for i = 1,2,...,N/P-1,N/P)
18 Delaying the waits 15 May 2015 Traffic Modelling 18! Implement boundary conditions Issend(old(N/P), up) Recv (old(1), down) Issend(old(1), down) Recv (old(n/p+1), up) call newroad(new, old, N/P) set old(i) = new(i) for i = 2,3,...,N/P-1) wait(up) old(n/p = new(m/p) wait(down) old(1) = new(1)
19 15 May 2015 Traffic Modelling 19 RMA synchronisation Similar synchronisation issues to non-blocking message passing but worse!
20 Remote Memory Access 20 HPC Concepts Imagine we can do halo swaps directly with read or write where do synchronisation issues become important? what assumptions are you making about remote reads and writes? Consider remote read first old(0) = old(n/p) from previous process old(n/p+1) = old(1) from next process call newroad(new, old, N/P) set old(i) = new(i) for i = 1,2,...,N/P-1,N/P
21 Remote Memory Access 21 HPC Concepts Imagine we can do halo swaps directly with read or write where do synchronisation issues become important? what assumptions are you making about remote reads and writes? Consider remote read first old(0) old(n/p+1) = old(1) = old(n/p) from previous process call newroad(new, old, N/P) from next process! synchronise to ensure my old values have all been read set old(i) = new(i) for i = 1,2,...,N/P-1,N/P! synchronise to ensure neighbours old values have been! updated before I read them on the next iteration assuming reads are blocking like Recv
22 Remote Memory Access 22 HPC Concepts Imagine we can do halo swaps directly with read or write where do synchronisation issues become important? what assumptions are you making about remote reads and writes? Consider remote writes set old(0) on next process = old(n/p) set old(n/p+1) on previous process = old(1) call newroad(new, old, N/P) set old(i) = new(i) for i = 1,2,...,N/P-1,N/P
23 Remote Memory Access 23 HPC Concepts Imagine we can do halo swaps directly with read or write where do synchronisation issues become important? what assumptions are you making about remote reads and writes? Consider remote writes set old(0) on next process = old(n/p) set old(n/p+1) on previous process = old(1)! synchronise to ensure my halos on old have been updated call newroad(new, old, N/P) set old(i) = new(i) for i = 1,2,...,N/P-1,N/P
24 Remote Memory Access 24 HPC Concepts Imagine we can do halo swaps directly with read or write where do synchronisation issues become important? what assumptions are you making about remote reads and writes? Consider remote writes set old(0) on next process = old(n/p) set old(n/p+1) on previous process = old(1)! synchronise to ensure my halos on old have been updated call newroad(new, old, N/P) assuming writes behave like a Bsend set old(i) = new(i) for i = 1,2,...,N/P-1,N/P! synchronise to ensure my neighbours have finished with their! old arrays (in newroad ) before overwriting them
25 Summary 25 HPC Concepts Synchronisation in PGAS approaches is not simple easy to write programs with subtle synchronisation errors Think first, code later!
Further MPI Programming. Paul Burton April 2015
Further MPI Programming Paul Burton April 2015 Blocking v Non-blocking communication Blocking communication - Call to MPI sending routine does not return until the send buffer (array) is safe to use again
More informationMPI Casestudy: Parallel Image Processing
MPI Casestudy: Parallel Image Processing David Henty 1 Introduction The aim of this exercise is to write a complete MPI parallel program that does a very basic form of image processing. We will start by
More informationNon-Blocking Communications
Non-Blocking Communications Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us
More informationNon-Blocking Communications
Non-Blocking Communications Deadlock 1 5 2 3 4 Communicator 0 2 Completion The mode of a communication determines when its constituent operations complete. - i.e. synchronous / asynchronous The form of
More informationExercises: Message-Passing Programming
T E H U I V E R S I T Y O H F R G E I B U Exercises: Message-Passing Programming Hello World avid Henty. Write an MPI program which prints the message Hello World.. Compile and run on several processes
More informationMessage Passing Programming. Modes, Tags and Communicators
Message Passing Programming Modes, Tags and Communicators Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us
More informationMPI Optimisation. Advanced Parallel Programming. David Henty, Iain Bethune, Dan Holmes EPCC, University of Edinburgh
MPI Optimisation Advanced Parallel Programming David Henty, Iain Bethune, Dan Holmes EPCC, University of Edinburgh Overview Can divide overheads up into four main categories: Lack of parallelism Load imbalance
More informationExercises: Message-Passing Programming
T H U I V R S I T Y O H F R G I U xercises: Message-Passing Programming Hello World avid Henty. Write an MPI program which prints the message Hello World. ompile and run on one process. Run on several
More informationParallel Programming
Parallel Programming Point-to-point communication Prof. Paolo Bientinesi pauldj@aices.rwth-aachen.de WS 18/19 Scenario Process P i owns matrix A i, with i = 0,..., p 1. Objective { Even(i) : compute Ti
More informationCOSC 6374 Parallel Computation
COSC 6374 Parallel Computation Message Passing Interface (MPI ) II Advanced point-to-point operations Spring 2008 Overview Point-to-point taxonomy and available functions What is the status of a message?
More informationMessage Passing Programming. Designing MPI Applications
Message Passing Programming Designing MPI Applications Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us
More informationMessage Passing Programming. Modes, Tags and Communicators
Message Passing Programming Modes, Tags and Communicators Overview Lecture will cover - explanation of MPI modes (Ssend, Bsend and Send) - meaning and use of message tags - rationale for MPI communicators
More informationDistributed Memory Programming With MPI (4)
Distributed Memory Programming With MPI (4) 2014 Spring Jinkyu Jeong (jinkyu@skku.edu) 1 Roadmap Hello World in MPI program Basic APIs of MPI Example program The Trapezoidal Rule in MPI. Collective communication.
More informationExercises: Message-Passing Programming
T H U I V R S I T Y O H F R G I U xercises: Message-Passing Programming Hello World avid Henty. Write an MPI program which prints the message Hello World. ompile and run on one process. Run on several
More informationMessage-Passing Programming with MPI
Message-Passing Programming with MPI Message-Passing Concepts David Henty d.henty@epcc.ed.ac.uk EPCC, University of Edinburgh Overview This lecture will cover message passing model SPMD communication modes
More informationParallel Short Course. Distributed memory machines
Parallel Short Course Message Passing Interface (MPI ) I Introduction and Point-to-point operations Spring 2007 Distributed memory machines local disks Memory Network card 1 Compute node message passing
More informationCSE. Parallel Algorithms on a cluster of PCs. Ian Bush. Daresbury Laboratory (With thanks to Lorna Smith and Mark Bull at EPCC)
Parallel Algorithms on a cluster of PCs Ian Bush Daresbury Laboratory I.J.Bush@dl.ac.uk (With thanks to Lorna Smith and Mark Bull at EPCC) Overview This lecture will cover General Message passing concepts
More informationAcknowledgments. Programming with MPI Basic send and receive. A Minimal MPI Program (C) Contents. Type to enter text
Acknowledgments Programming with MPI Basic send and receive Jan Thorbecke Type to enter text This course is partly based on the MPI course developed by Rolf Rabenseifner at the High-Performance Computing-Center
More informationProgramming with MPI Basic send and receive
Programming with MPI Basic send and receive Jan Thorbecke Type to enter text Delft University of Technology Challenge the future Acknowledgments This course is partly based on the MPI course developed
More informationSINGLE-SIDED PGAS COMMUNICATIONS LIBRARIES. Basic usage of OpenSHMEM
SINGLE-SIDED PGAS COMMUNICATIONS LIBRARIES Basic usage of OpenSHMEM 2 Outline Concept and Motivation Remote Read and Write Synchronisation Implementations OpenSHMEM Summary 3 Philosophy of the talks In
More informationProgramming with MPI
Programming with MPI p. 1/?? Programming with MPI Point-to-Point Transfers Nick Maclaren nmm1@cam.ac.uk May 2008 Programming with MPI p. 2/?? Digression Most books and courses teach point--to--point first
More informationMPI Visualisation. Thomas RICHARD. Academic year 2010/2011
MPI Visualisation Thomas RICHARD Academic year 2010/2011 MSc in High Performance Computing Edinburgh Parallel Computing Centre The University of Edinburgh Year of Presentation: 2011 Abstract The MPI Visualisation
More informationCS450 - Structure of Higher Level Languages
Spring 2018 Streams February 24, 2018 Introduction Streams are abstract sequences. They are potentially infinite we will see that their most interesting and powerful uses come in handling infinite sequences.
More informationCSC D70: Compiler Optimization Register Allocation
CSC D70: Compiler Optimization Register Allocation Prof. Gennady Pekhimenko University of Toronto Winter 2018 The content of this lecture is adapted from the lectures of Todd Mowry and Phillip Gibbons
More informationCSE 160 Lecture 18. Message Passing
CSE 160 Lecture 18 Message Passing Question 4c % Serial Loop: for i = 1:n/3-1 x(2*i) = x(3*i); % Restructured for Parallelism (CORRECT) for i = 1:3:n/3-1 y(2*i) = y(3*i); for i = 2:3:n/3-1 y(2*i) = y(3*i);
More informationMessage-Passing and MPI Programming
Message-Passing and MPI Programming More on Point-to-Point 6.1 Introduction N.M. Maclaren nmm1@cam.ac.uk July 2010 These facilities are the most complicated so far, but you may well want to use them. There
More informationAgent Design Example Problems State Spaces. Searching: Intro. CPSC 322 Search 1. Textbook Searching: Intro CPSC 322 Search 1, Slide 1
Searching: Intro CPSC 322 Search 1 Textbook 3.0 3.3 Searching: Intro CPSC 322 Search 1, Slide 1 Lecture Overview 1 Agent Design 2 Example Problems 3 State Spaces Searching: Intro CPSC 322 Search 1, Slide
More informationPoint-to-Point Communication. Reference:
Point-to-Point Communication Reference: http://foxtrot.ncsa.uiuc.edu:8900/public/mpi/ Introduction Point-to-point communication is the fundamental communication facility provided by the MPI library. Point-to-point
More informationStandard MPI - Message Passing Interface
c Ewa Szynkiewicz, 2007 1 Standard MPI - Message Passing Interface The message-passing paradigm is one of the oldest and most widely used approaches for programming parallel machines, especially those
More informationProgramming Scalable Systems with MPI. Clemens Grelck, University of Amsterdam
Clemens Grelck University of Amsterdam UvA / SurfSARA High Performance Computing and Big Data Course June 2014 Parallel Programming with Compiler Directives: OpenMP Message Passing Gentle Introduction
More informationProgramming with MPI
Programming with MPI p. 1/?? Programming with MPI One-sided Communication Nick Maclaren nmm1@cam.ac.uk October 2010 Programming with MPI p. 2/?? What Is It? This corresponds to what is often called RDMA
More informationPractical Scientific Computing: Performanceoptimized
Practical Scientific Computing: Performanceoptimized Programming Programming with MPI November 29, 2006 Dr. Ralf-Peter Mundani Department of Computer Science Chair V Technische Universität München, Germany
More informationWriting Message Passing Parallel Programs with MPI
Writing Message Passing Parallel Programs with MPI A Two Day Course on MPI Usage Course Notes Version 1.8.2 Neil MacDonald, Elspeth Minty, Joel Malard, Tim Harding, Simon Brown, Mario Antonioletti Edinburgh
More informationParallel Programming with Coarray Fortran
Parallel Programming with Coarray Fortran SC10 Tutorial, November 15 th 2010 David Henty, Alan Simpson (EPCC) Harvey Richardson, Bill Long, Nathan Wichmann (Cray) Tutorial Overview The Fortran Programming
More informationMPI Case Study. Fabio Affinito. April 24, 2012
MPI Case Study Fabio Affinito April 24, 2012 In this case study you will (hopefully..) learn how to Use a master-slave model Perform a domain decomposition using ghost-zones Implementing a message passing
More informationStepwise Refinement. Lecture 12 COP 3014 Spring February 2, 2017
Stepwise Refinement Lecture 12 COP 3014 Spring 2017 February 2, 2017 Top-Down Stepwise Refinement Top down stepwise refinement is a useful problem-solving technique that is good for coming up with an algorithm.
More informationHigh Performance Computing
High Performance Computing Course Notes 2009-2010 2010 Message Passing Programming II 1 Communications Point-to-point communications: involving exact two processes, one sender and one receiver For example,
More informationWelfare Navigation Using Genetic Algorithm
Welfare Navigation Using Genetic Algorithm David Erukhimovich and Yoel Zeldes Hebrew University of Jerusalem AI course final project Abstract Using standard navigation algorithms and applications (such
More informationAdvanced MPI. Andrew Emerson
Advanced MPI Andrew Emerson (a.emerson@cineca.it) Agenda 1. One sided Communications (MPI-2) 2. Dynamic processes (MPI-2) 3. Profiling MPI and tracing 4. MPI-I/O 5. MPI-3 11/12/2015 Advanced MPI 2 One
More informationCS2 Algorithms and Data Structures Note 10. Depth-First Search and Topological Sorting
CS2 Algorithms and Data Structures Note 10 Depth-First Search and Topological Sorting In this lecture, we will analyse the running time of DFS and discuss a few applications. 10.1 A recursive implementation
More informationDiscussion: MPI Basic Point to Point Communication I. Table of Contents. Cornell Theory Center
1 of 14 11/1/2006 3:58 PM Cornell Theory Center Discussion: MPI Point to Point Communication I This is the in-depth discussion layer of a two-part module. For an explanation of the layers and how to navigate
More informationDATA MINING I - CLUSTERING - EXERCISES
EPFL ENAC TRANSP-OR Prof. M. Bierlaire Gael Lederrey & Nikola Obrenovic Decision Aids Spring 2018 DATA MINING I - CLUSTERING - EXERCISES Exercise 1 In this exercise, you will implement the k-means clustering
More informationCPS 310 midterm exam #1, 2/19/2016
CPS 310 midterm exam #1, 2/19/2016 Your name please: NetID: Answer all questions. Please attempt to confine your answers to the boxes provided. For the execution tracing problem (P3) you may wish to use
More informationProgramming with Message Passing PART I: Basics. HPC Fall 2012 Prof. Robert van Engelen
Programming with Message Passing PART I: Basics HPC Fall 2012 Prof. Robert van Engelen Overview Communicating processes MPMD and SPMD Point-to-point communications Send and receive Synchronous, blocking,
More informationParallel Programming
Parallel Programming Prof. Paolo Bientinesi pauldj@aices.rwth-aachen.de WS 17/18 Exercise MPI_Irecv MPI_Wait ==?? MPI_Recv Paolo Bientinesi MPI 2 Exercise MPI_Irecv MPI_Wait ==?? MPI_Recv ==?? MPI_Irecv
More informationConcurrent & Distributed Systems Supervision Exercises
Concurrent & Distributed Systems Supervision Exercises Stephen Kell Stephen.Kell@cl.cam.ac.uk November 9, 2009 These exercises are intended to cover all the main points of understanding in the lecture
More informationEMF Temporality. Jean-Claude Coté Éric Ladouceur
EMF Temporality Jean-Claude Coté Éric Ladouceur 1 Introduction... 3 1.1 Dimensions of Time... 3 3 Proposed EMF implementation... 4 3.1 Modeled Persistence... 4 3.2 Modeled Temporal API... 5 3.2.1 Temporal
More informationIntroduction to parallel computing concepts and technics
Introduction to parallel computing concepts and technics Paschalis Korosoglou (support@grid.auth.gr) User and Application Support Unit Scientific Computing Center @ AUTH Overview of Parallel computing
More informationFramework of an MPI Program
MPI Charles Bacon Framework of an MPI Program Initialize the MPI environment MPI_Init( ) Run computation / message passing Finalize the MPI environment MPI_Finalize() Hello World fragment #include
More informationParallel Numerics. 1 Data Dependency Graphs & DAGs. Exercise 3: Vector/Vector Operations & P2P Communication II
Technische Universität München WiSe 2014/15 Institut für Informatik Prof. Dr. Thomas Huckle Dipl.-Inf. Christoph Riesinger Sebastian Rettenberger, M.Sc. Parallel Numerics Exercise 3: Vector/Vector Operations
More informationProgramming Scalable Systems with MPI. UvA / SURFsara High Performance Computing and Big Data. Clemens Grelck, University of Amsterdam
Clemens Grelck University of Amsterdam UvA / SURFsara High Performance Computing and Big Data Message Passing as a Programming Paradigm Gentle Introduction to MPI Point-to-point Communication Message Passing
More informationSlides prepared by : Farzana Rahman 1
Introduction to MPI 1 Background on MPI MPI - Message Passing Interface Library standard defined by a committee of vendors, implementers, and parallel programmers Used to create parallel programs based
More informationMessage Passing Interface. George Bosilca
Message Passing Interface George Bosilca bosilca@icl.utk.edu Message Passing Interface Standard http://www.mpi-forum.org Current version: 3.1 All parallelism is explicit: the programmer is responsible
More informationAdvanced MPI. Andrew Emerson
Advanced MPI Andrew Emerson (a.emerson@cineca.it) Agenda 1. One sided Communications (MPI-2) 2. Dynamic processes (MPI-2) 3. Profiling MPI and tracing 4. MPI-I/O 5. MPI-3 22/02/2017 Advanced MPI 2 One
More informationThis lecture presents ordered lists. An ordered list is one which is maintained in some predefined order, such as alphabetical or numerical order.
6.1 6.2 This lecture presents ordered lists. An ordered list is one which is maintained in some predefined order, such as alphabetical or numerical order. A list is numerically ordered if, for every item
More informationCache Coherence Tutorial
Cache Coherence Tutorial The cache coherence protocol described in the book is not really all that difficult and yet a lot of people seem to have troubles when it comes to using it or answering an assignment
More informationIntermediate MPI (Message-Passing Interface) 1/11
Intermediate MPI (Message-Passing Interface) 1/11 What happens when a process sends a message? Suppose process 0 wants to send a message to process 1. Three possibilities: Process 0 can stop and wait until
More informationIntermediate MPI (Message-Passing Interface) 1/11
Intermediate MPI (Message-Passing Interface) 1/11 What happens when a process sends a message? Suppose process 0 wants to send a message to process 1. Three possibilities: Process 0 can stop and wait until
More informationParallel Programming
Parallel Programming Prof. Paolo Bientinesi pauldj@aices.rwth-aachen.de WS 16/17 Point-to-point communication Send MPI_Ssend MPI_Send MPI_Isend. MPI_Bsend Receive MPI_Recv MPI_Irecv Paolo Bientinesi MPI
More informationRSX Best Practices. Mark Cerny, Cerny Games David Simpson, Naughty Dog Jon Olick, Naughty Dog
RSX Best Practices Mark Cerny, Cerny Games David Simpson, Naughty Dog Jon Olick, Naughty Dog RSX Best Practices About libgcm Using the SPUs with the RSX Brief overview of GCM Replay December 7 th, 2004
More informationParallel Programming
Parallel Programming for Multicore and Cluster Systems von Thomas Rauber, Gudula Rünger 1. Auflage Parallel Programming Rauber / Rünger schnell und portofrei erhältlich bei beck-shop.de DIE FACHBUCHHANDLUNG
More informationCSE 410 Final Exam Sample Solution 6/08/10
Question 1. (12 points) (caches) (a) One choice in designing cache memories is to pick a block size. Which of the following do you think would be the most reasonable size for cache blocks on a computer
More informationParallel Processing. Parallel Processing. 4 Optimization Techniques WS 2018/19
Parallel Processing WS 2018/19 Universität Siegen rolanda.dwismuellera@duni-siegena.de Tel.: 0271/740-4050, Büro: H-B 8404 Stand: September 7, 2018 Betriebssysteme / verteilte Systeme Parallel Processing
More informationLaplace Exercise Solution Review
Laplace Exercise Solution Review John Urbanic Parallel Computing Scientist Pittsburgh Supercomputing Center Copyright 2017 Finished? If you have finished, we can review a few principles that you have inevitably
More informationThere are a number of ways of doing this, and we will examine two of them. Fig1. Circuit 3a Flash two LEDs.
Flashing LEDs We re going to experiment with flashing some LEDs in this chapter. First, we will just flash two alternate LEDs, and then make a simple set of traffic lights, finally a running pattern; you
More informationComputer Architecture
Boaz Kantor Introduction to Computer Science, Fall semester 2010-2011 IDC Herzliya Computer Architecture I know what you're thinking, 'cause right now I'm thinking the same thing. Actually, I've been thinking
More informationConcurrent/Parallel Processing
Concurrent/Parallel Processing David May: April 9, 2014 Introduction The idea of using a collection of interconnected processing devices is not new. Before the emergence of the modern stored program computer,
More informationThe MPI Message-passing Standard Lab Time Hands-on. SPD Course Massimo Coppola
The MPI Message-passing Standard Lab Time Hands-on SPD Course 2016-2017 Massimo Coppola Remember! Simplest programs do not need much beyond Send and Recv, still... Each process lives in a separate memory
More informationScaling Tuple-Space Communication in the Distributive Interoperable Executive Library. Jason Coan, Zaire Ali, David White and Kwai Wong
Scaling Tuple-Space Communication in the Distributive Interoperable Executive Library Jason Coan, Zaire Ali, David White and Kwai Wong August 18, 2014 Abstract The Distributive Interoperable Executive
More informationInvestigating the Performance of Asynchronous Jacobi s Method for Solving Systems of Linear Equations
Investigating the Performance of Asynchronous Jacobi s Method for Solving Systems of Linear Equations Bethune, Iain and Bull, J. Mark and Dingle, Nicholas J. and Higham, Nicholas J. 2011 MIMS EPrint: 2011.82
More informationCS4961 Parallel Programming. Lecture 19: Message Passing, cont. 11/5/10. Programming Assignment #3: Simple CUDA Due Thursday, November 18, 11:59 PM
Parallel Programming Lecture 19: Message Passing, cont. Mary Hall November 4, 2010 Programming Assignment #3: Simple CUDA Due Thursday, November 18, 11:59 PM Today we will cover Successive Over Relaxation.
More informationParallelization. Tianhe-1A, 2.45 Pentaflops/s, 224 Terabytes RAM. Nigel Mitchell
Parallelization Tianhe-1A, 2.45 Pentaflops/s, 224 Terabytes RAM Nigel Mitchell Outline Pros and Cons of parallelization Shared memory vs. cluster computing MPI as a tool for sending and receiving messages
More informationLock Ahead: Shared File Performance Improvements
Lock Ahead: Shared File Performance Improvements Patrick Farrell Cray Lustre Developer Steve Woods Senior Storage Architect woods@cray.com September 2016 9/12/2016 Copyright 2015 Cray Inc 1 Agenda Shared
More informationAn Extension of XcalableMP PGAS Lanaguage for Multi-node GPU Clusters
An Extension of XcalableMP PGAS Lanaguage for Multi-node Clusters Jinpil Lee, Minh Tuan Tran, Tetsuya Odajima, Taisuke Boku and Mitsuhisa Sato University of Tsukuba 1 Presentation Overview l Introduction
More informationPROVING THINGS ABOUT PROGRAMS
PROVING THINGS ABOUT CONCURRENT PROGRAMS Lecture 23 CS2110 Fall 2010 Overview 2 Last time we looked at techniques for proving things about recursive algorithms We saw that in general, recursion matches
More informationOptimization of MPI Applications Rolf Rabenseifner
Optimization of MPI Applications Rolf Rabenseifner University of Stuttgart High-Performance Computing-Center Stuttgart (HLRS) www.hlrs.de Optimization of MPI Applications Slide 1 Optimization and Standardization
More informationParallel Computing Paradigms
Parallel Computing Paradigms Message Passing João Luís Ferreira Sobral Departamento do Informática Universidade do Minho 31 October 2017 Communication paradigms for distributed memory Message passing is
More informationDefinition: A data structure is a way of organizing data in a computer so that it can be used efficiently.
The Science of Computing I Lesson 4: Introduction to Data Structures Living with Cyber Pillar: Data Structures The need for data structures The algorithms we design to solve problems rarely do so without
More informationAn introduction to MPI
An introduction to MPI C MPI is a Library for Message-Passing Not built in to compiler Function calls that can be made from any compiler, many languages Just link to it Wrappers: mpicc, mpif77 Fortran
More informationProcess Synchronisation (contd.) Deadlock. Operating Systems. Spring CS5212
Operating Systems Spring 2009-2010 Outline Process Synchronisation (contd.) 1 Process Synchronisation (contd.) 2 Announcements Presentations: will be held on last teaching week during lectures make a 20-minute
More informationHigh-Performance Computing: MPI (ctd)
High-Performance Computing: MPI (ctd) Adrian F. Clark: alien@essex.ac.uk 2015 16 Adrian F. Clark: alien@essex.ac.uk High-Performance Computing: MPI (ctd) 2015 16 1 / 22 A reminder Last time, we started
More informationCS244 Advanced Topics in Computer Networks Midterm Exam Monday, May 2, 2016 OPEN BOOK, OPEN NOTES, INTERNET OFF
CS244 Advanced Topics in Computer Networks Midterm Exam Monday, May 2, 2016 OPEN BOOK, OPEN NOTES, INTERNET OFF Your Name: Answers SUNet ID: root @stanford.edu In accordance with both the letter and the
More informationCOMP Data Structures
COMP 2140 - Data Structures Shahin Kamali Topic 5 - Sorting University of Manitoba Based on notes by S. Durocher. COMP 2140 - Data Structures 1 / 55 Overview Review: Insertion Sort Merge Sort Quicksort
More informationTeaching London Computing
Teaching London Computing A Level Computer Science Topic 1: GCSE Python Recap William Marsh School of Electronic Engineering and Computer Science Queen Mary University of London Aims What is programming?
More informationME964 High Performance Computing for Engineering Applications
ME964 High Performance Computing for Engineering Applications Parallel Computing with MPI Building/Debugging MPI Executables MPI Send/Receive Collective Communications with MPI April 10, 2012 Dan Negrut,
More informationAdvanced Parallel Programming
Advanced Parallel Programming Derived Datatypes Dr David Henty HPC Training and Support Manager d.henty@epcc.ed.ac.uk +44 131 650 5960 16/01/2014 MPI-IO 2: Derived Datatypes 2 Overview Lecture will cover
More informationComputer Architecture
Boaz Kantor Introduction to Computer Science, Fall semester 2009-2010 IDC Herzliya Computer Architecture I know what you're thinking, 'cause right now I'm thinking the same thing. Actually, I've been thinking
More informationExamples of P vs NP: More Problems
Examples of P vs NP: More Problems COMP1600 / COMP6260 Dirk Pattinson Australian National University Semester 2, 2017 Catch Up / Drop in Lab When Fridays, 15.00-17.00 Where N335, CSIT Building (bldg 108)
More informationOptimising for the p690 memory system
Optimising for the p690 memory Introduction As with all performance optimisation it is important to understand what is limiting the performance of a code. The Power4 is a very powerful micro-processor
More information4 Algorithms. Definition of an Algorithm
4 Algorithms Definition of an Algorithm An algorithm is the abstract version of a computer program. In essence a computer program is the codification of an algorithm, and every algorithm can be coded as
More informationQUIZ. What is wrong with this code that uses default arguments?
QUIZ What is wrong with this code that uses default arguments? Solution The value of the default argument should be placed in either declaration or definition, not both! QUIZ What is wrong with this code
More informationLecture 10 Exceptions and Interrupts. How are exceptions generated?
Lecture 10 Exceptions and Interrupts The ARM processor can work in one of many operating modes. So far we have only considered user mode, which is the "normal" mode of operation. The processor can also
More informationACTIVITY 2: Reflection of Light
UNIT L Developing Ideas ACTIVITY 2: Reflection of Light Purpose Most people realize that light is necessary to see things, like images in mirrors, and various kinds of objects. But how does that happen?
More informationò Server can crash or be disconnected ò Client can crash or be disconnected ò How to coordinate multiple clients accessing same file?
Big picture (from Sandberg et al.) NFS Don Porter CSE 506 Intuition Challenges Instead of translating VFS requests into hard drive accesses, translate them into remote procedure calls to a server Simple,
More informationAdvanced Features. SC10 Tutorial, November 15 th Parallel Programming with Coarray Fortran
Advanced Features SC10 Tutorial, November 15 th 2010 Parallel Programming with Coarray Fortran Advanced Features: Overview Execution segments and Synchronisation Non-global Synchronisation Critical Sections
More informationNFS. Don Porter CSE 506
NFS Don Porter CSE 506 Big picture (from Sandberg et al.) Intuition ò Instead of translating VFS requests into hard drive accesses, translate them into remote procedure calls to a server ò Simple, right?
More information7. Traffic Simulation via Cellular Automata
EX7CellularAutomataForTraffic.nb 1 7. Traffic Simulation via Cellular Automata From Chapter 12 (mainly) of "Computer Simulations with Mathematica", Gaylord & Wellin, Springer 1995 [DCU library ref 510.2855GAY]
More informationCSE 590: Special Topics Course ( Supercomputing ) Lecture 6 ( Analyzing Distributed Memory Algorithms )
CSE 590: Special Topics Course ( Supercomputing ) Lecture 6 ( Analyzing Distributed Memory Algorithms ) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2012 2D Heat Diffusion
More informationTree Search for Travel Salesperson Problem Pacheco Text Book Chapt 6 T. Yang, UCSB CS140, Spring 2014
Tree Search for Travel Salesperson Problem Pacheco Text Book Chapt 6 T. Yang, UCSB CS140, Spring 2014 Outline Tree search for travel salesman problem. Recursive code Nonrecusive code Parallelization with
More information4.1 COMPUTATIONAL THINKING AND PROBLEM-SOLVING
4.1 COMPUTATIONAL THINKING AND PROBLEM-SOLVING 4.1.2 ALGORITHMS ALGORITHM An Algorithm is a procedure or formula for solving a problem. It is a step-by-step set of operations to be performed. It is almost
More information