Approaches to Parallel Computing

1 Approaches to Parallel Computing
K. Cooper, Department of Mathematics, Washington State University, 2019

2 Paradigms: Concept
Many hands make light work... Set several processors to work on separate aspects of a problem.
- Simulation: One program, different data, no communication.
- Master-Slave: One program sends small tasks to many subprocesses. Communication only to/from the master process.
- Multiple Instruction Streams: Separate programs running on many processors. Communication among processes via messages.
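The master-slave pattern above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: `small_task` and `master` are hypothetical names, and a thread pool from the standard library stands in for the subprocesses so that the sketch runs anywhere. Replacing `ThreadPoolExecutor` with `ProcessPoolExecutor` (under an `if __name__ == "__main__":` guard) would be one way to put the workers on separate cores.

```python
from concurrent.futures import ThreadPoolExecutor


def small_task(n):
    """One small unit of work (hypothetical): sum of squares below n."""
    return sum(i * i for i in range(n))


def master(task_inputs, n_workers=4):
    """Hand each input to a pool of workers and collect results in input order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Workers never talk to each other; results flow back only to the master.
        return list(pool.map(small_task, task_inputs))


results = master([10, 100, 1000])
print(results)
```

The only communication is the task going out and the result coming back, which is what distinguishes this paradigm from the message-passing one.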

3 Paradigms: Single Instruction, Multiple Data (SIMD)
- Single CPU with many ALUs
- ID (instruction decode) step fills registers for each ALU
- EX (execute) step does computation simultaneously on all ALUs

4 Paradigms: SIMD
Pipeline... After an instruction is decoded, the same operation can be executed on every element of a vector (array) of numbers.

5 Paradigms: Disadvantages
- Specialized architecture
- Slower to fill ALU registers (bottleneck)
- Many ALUs idle during EX
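One possible reason for idle ALUs (an assumption for illustration; the slide does not say which cause it has in mind) is that the vector length is not a multiple of the number of ALUs, so the last EX step is only partly filled. The hypothetical helper `simd_utilization` below just does that arithmetic.

```python
import math


def simd_utilization(n_elements, n_alus):
    """Fraction of ALU slots doing useful work when n_elements are processed
    n_alus at a time (assumes one element per ALU per EX step)."""
    if n_elements <= 0 or n_alus <= 0:
        raise ValueError("n_elements and n_alus must be positive")
    steps = math.ceil(n_elements / n_alus)  # EX steps needed
    return n_elements / (steps * n_alus)    # useful slots / total slots


for n in (64, 65, 100):
    print(n, simd_utilization(n, 64))
```

Under this model, 64 elements on 64 ALUs use every slot, while 65 elements need a second EX step in which 63 of the 64 ALUs sit idle.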

6 Paradigms: Single Instruction, Multiple Thread (SIMT)
- Many CPUs
- Main program spins up many threads for one instruction
- Examples:
  - Python parallel package uses several cores of a CPU
  - CUDA computing uses hundreds or thousands of cores of a GPU
- Conjecture: This is only efficient with many, many cores.

7 MIMD: Multiple Instruction, Multiple Data
- Many CPUs
- Asynchronous
- Redundant work
- Much more versatile than most SIMD

8 MIMD: Shared Memory
E.g. quad-core CPU
- Bus-based
  - Limited bandwidth on the front-side bus (FSB)
  - Scales poorly
- Switch-based
  - Expensive
  - Still does not scale well (communication bottleneck)

9 MIMD: Distributed Memory
- Each node adds memory to the system.
- Maybe no single node sees the entire problem.
- E.g. Beowulf cluster
- Each CPU requires its own dedicated memory
  - Could be separate sectors in a single RAM...
  - Could be separate machines
- Communication becomes a roadblock

10 MIMD: Distributed Memory MIMD [figure not reproduced]

11 MIMD: Message Passing
- Typically, each instruction stream starts identically
- Each processor starts with the same code
- Processes perform different tasks based on rank
- I/O to processes is performed through messages
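The rank-based branching described above can be imitated with nothing but the Python standard library. In this sketch, threads stand in for processes and `queue.Queue` objects stand in for the message layer; `run` and `launch` are hypothetical names, and this is not MPI itself, only an illustration of the same structure.

```python
import queue
import threading


def run(rank, size, inboxes):
    """Same code for every rank; behaviour branches on rank."""
    if rank == 0:
        # Rank 0 collects one message from every other rank.
        received = {}
        for _ in range(size - 1):
            sender, payload = inboxes[0].get()
            received[sender] = payload
        return received
    # Every other rank does its own piece of work and sends it to rank 0.
    inboxes[0].put((rank, rank * rank))
    return None


def launch(size):
    """Start `size` instances of run() and return what rank 0 gathered."""
    inboxes = [queue.Queue() for _ in range(size)]
    results = [None] * size

    def target(rank):
        results[rank] = run(rank, size, inboxes)

    threads = [threading.Thread(target=target, args=(r,)) for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results[0]


print(launch(4))
```

Every instance starts in the same function; only the value of `rank` decides whether it sends or receives, and data reaches rank 0 only through messages.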

12 MIMD: Interconnection Network
- Front-side bus
- InfiniBand
- Ethernet (slow)

13 MIMD: SPMD (Single Program, Multiple Data)
- You write one (1) program...
- ...that program runs on every processor
- Instances:
  - Processes perform tasks based on conditions and messages
  - Processes have different inputs, outputs
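A common way for one program to give each instance different inputs is to split the data by rank. The sketch below assumes a simple block decomposition; `block_bounds` and `spmd_program` are hypothetical names, and the ranks are simulated in a loop. In a real MPI run, `rank` and `size` would come from the communicator (for example `MPI.COMM_WORLD.Get_rank()` and `Get_size()` in mpi4py) rather than being passed in by hand.

```python
def block_bounds(rank, size, n):
    """Half-open index range [lo, hi) owned by `rank` when n items are
    split as evenly as possible over `size` instances."""
    base, extra = divmod(n, size)
    lo = rank * base + min(rank, extra)          # first `extra` ranks get one more item
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


def spmd_program(rank, size, data):
    """The one program: every instance runs this on its own slice."""
    lo, hi = block_bounds(rank, size, len(data))
    return sum(data[lo:hi])


data = list(range(10))
size = 3
partials = [spmd_program(rank, size, data) for rank in range(size)]
print(partials, sum(partials))
```

Each instance sees a different slice and produces a different partial result, yet the source code is identical for all of them.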

14 MIMD: SPMD Nomenclature
- Node: A computer connected to a head machine by some means
- Interconnect: The means of connecting the nodes
- Core: A single processor on one of the CPUs of a node
- Processor: Usually means a core
- Process: A program that runs on a processor. Possibly (but not desirably) many processes per processor.

15 Summary
- When CPUs were expensive: Pipelines
- As chips became denser: SIMD
- As CPUs become commodities: MIMD
- As GPUs become dense: GPU

16 Summary: Goal
- Hope to show that we can modify programs easily to take advantage of modern processors
- Getting speedups is more problematic

17 Summary: Resources
- Solitary: Two CPUs, four cores each, 8GB RAM; runs prime1 on 6 cores in 0.038 seconds on OpenMPI
- Cluster: Five nodes, one CPU per node, six cores per CPU, 8GB RAM per node; runs prime1 on 6 cores in 0.041 seconds on MPICH2
- Labs: 20 to 32 nodes, two to eight cores per node

Parallel Computing Ideas

Parallel Computing Ideas. K., Department of Mathematics, 2018
Why: When to go for speed
Historically:
- Production code
- Code takes a long time to run
- Code runs many times
- Code is not an end in itself
2010: