Parallel And Distributed Compilers
Parallel Systems
A parallel system aims to solve a single given problem using multiple processors. Such systems are typically used for applications that would take a huge amount of compute time on a single processor.
Distributed Systems
A distributed system also uses multiple processors, but it handles the tasks of many users; there is no single problem involved. The processors are autonomous and connected by a network. Distributed systems are more cost effective, fault tolerant, and easier to extend, and a distributed system can also be used as a parallel one.
Multiprocessors and Multicomputers
An important issue in compiling parallel and distributed programs is the architecture of the target system.
Multiprocessor:
- Consists of several processors connected by a single bus
- All processors share a single memory
- Processes are tightly coupled
Multicomputer:
- Consists of several processors connected by a network
- Processes run in different address spaces
- Processors are loosely coupled
Multiprocessor (diagram): several CPUs connected to one shared memory.
Multicomputer (diagram): each CPU has its own local memory, and the CPU-memory pairs are connected by a network.
Parallel Programming Models
These models provide support for expressing parallelism as well as communication and synchronization between parallel tasks:
- Shared variables
- Message passing
- Objects
- Tuple space
- Data-parallel programming
The first two are low-level communication mechanisms that reflect the underlying hardware; the others are high-level constructs that allow communication at a higher level of abstraction.
Shared Variables
This is the simplest model of communication between processes. Part of the address space of the processes overlaps, so that multiple processes can access the same variables. These are called shared variables, and they serve as the mechanism of communication between the processes.
Shared Variables
The major problem here is synchronizing access to the shared variables. If multiple processes simultaneously try to change the same data, the result may be unpredictable and the data structure may be left in an inconsistent state. Consider two processes that simultaneously try to increment a shared variable X by executing:
X : shared integer;
X := X + 1;
Suppose X has the value 5. If both processes increment the variable, the resulting value of X should be 7. However, if both processes read the value 5 and then compute the new value, both assign the value 6 to X instead of 7.
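The lost update above can be reproduced deterministically. The following Python sketch (the variable names are illustrative) uses a barrier to force both threads to read X before either writes it back, so one increment is always lost:

```python
import threading

x = 5                          # shared variable, initially 5
step = threading.Barrier(2)    # forces both threads to read before either writes

def increment():
    global x
    local = x    # read the shared value (both threads read 5)
    step.wait()  # both threads have now read; neither has written yet
    x = local + 1  # both write back 6, so one increment is lost

t1 = threading.Thread(target=increment)
t2 = threading.Thread(target=increment)
t1.start(); t2.start()
t1.join(); t2.join()
print(x)  # 6, not the expected 7
```

Without the barrier the bad interleaving still exists, but it only shows up occasionally; the barrier just makes the race visible on every run.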
Shared Variables: Lock Variables
To prevent such undesirable behavior, synchronization primitives are needed to ensure that only one process at a time can access a given shared variable. This form of synchronization is called mutual-exclusion synchronization. The simplest such primitive is the lock variable, which has indivisible operations to acquire and release the lock. If a process tries to acquire a lock that has already been acquired, it blocks until the lock is released.
X : shared integer;
X_lock : lock;
Acquire_Lock(X_lock);
X := X + 1;
Release_Lock(X_lock);
With the lock, only one process can execute the increment statement at any given time. However, locks are error prone: the programmer must pair every acquire with a release, and a thread that fails to get a lock blocks until the lock becomes free.
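The Acquire_Lock / Release_Lock pattern above maps directly onto a mutex. A minimal Python sketch, with `with x_lock:` standing in for the acquire/release pair:

```python
import threading

x = 0
x_lock = threading.Lock()  # plays the role of X_lock

def increment(times):
    global x
    for _ in range(times):
        with x_lock:   # Acquire_Lock on entry, Release_Lock on exit
            x = x + 1  # only one thread executes this at a time

threads = [threading.Thread(target=increment, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(x)  # 40000: no increments are lost
```

Using the `with` statement also sidesteps the error-proneness mentioned above: the lock is released even if the critical section raises an exception.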
Shared Variables: Monitors
Monitors provide control by allowing only one process at a time to access a critical resource. A monitor is a class/module/package that contains procedures and data.
An Abstract Monitor
name : monitor
  some local declarations
  initialize local data
  procedure name(arguments)
    do some work
  other procedures
Monitor Rules
- Any process can call any monitor procedure at any time
- Only one process may be inside a monitor procedure at a time
- No process may directly access a monitor's local variables
- A monitor may only access its own local variables
Things Needed to Enforce a Monitor
- wait operation: forces the running process to sleep
- signal operation: wakes up a sleeping process
- A condition: stores who is waiting for a particular reason; implemented as a queue
A Running Example
BinMonitor : monitor
  bin : integer;
  occupied : Boolean := false;
  full, empty : condition;
  procedure put(x : integer)
    while occupied do wait(empty); od
    bin := x;
    occupied := true;
    signal(full);
  end;
  procedure get(x : out integer)
    while not occupied do wait(full); od
    x := bin;
    occupied := false;
    signal(empty);
  end;
end;
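The BinMonitor pseudocode can be sketched in Python with `threading.Condition` objects standing in for the `full` and `empty` conditions; the shared lock makes `put` and `get` mutually exclusive, as the monitor rules require (class and variable names are illustrative):

```python
import threading

class BinMonitor:
    """One-slot buffer; the shared lock makes the methods mutually exclusive."""
    def __init__(self):
        self._lock = threading.Lock()
        self._full = threading.Condition(self._lock)   # wait(full)/signal(full)
        self._empty = threading.Condition(self._lock)  # wait(empty)/signal(empty)
        self._occupied = False
        self._bin = None

    def put(self, x):
        with self._lock:
            while self._occupied:      # while occupied do wait(empty)
                self._empty.wait()
            self._bin = x
            self._occupied = True
            self._full.notify()        # signal(full)

    def get(self):
        with self._lock:
            while not self._occupied:  # while not occupied do wait(full)
                self._full.wait()
            x = self._bin
            self._occupied = False
            self._empty.notify()       # signal(empty)
            return x

bin_monitor = BinMonitor()
results = []
consumer = threading.Thread(
    target=lambda: results.extend(bin_monitor.get() for _ in range(3)))
consumer.start()
for v in (10, 20, 30):
    bin_monitor.put(v)   # blocks while the single slot is occupied
consumer.join()
print(results)  # [10, 20, 30]
```

Note that wait is wrapped in a `while` loop, not an `if`, exactly as in the pseudocode: a woken process must re-check the condition before proceeding.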
Summary: Monitors
Advantages:
- Data-access synchronization is simplified (compared with semaphores or locks)
- Better encapsulation
Disadvantages:
- Deadlock is still possible (in monitor code)
- The programmer can still misuse monitors
- No provision for information exchange between machines
Message Passing Programming Model
Shared data is communicated using send/receive services (across an external network). Unlike the shared-variable model, the shared data must be formatted into message chunks for distribution (the shared model works no matter how the data is intermixed). Coordination is achieved by sending and receiving messages. Program components can run on the same or different systems, so thousands of processors can be used. Standard libraries exist to encapsulate messages, such as Express, Parallel Virtual Machine (PVM), and the Message Passing Interface (MPI).
Message Passing Issues
Synchronization semantics: when does a send/receive operation terminate?
- Blocking (aka synchronous): the sender waits until its message is received; the receiver waits if no message is available.
- Non-blocking (aka asynchronous): the send operation returns immediately; the receive operation returns even if no message is available (polling).
- Partially blocking/non-blocking: send()/receive() with a timeout.
Buffering: how many buffers does the OS kernel provide between sender and receiver?
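The three receive semantics can be demonstrated within a single process using a bounded queue as a stand-in for the kernel's message buffer (a sketch; `mailbox` and its size are illustrative):

```python
import queue

mailbox = queue.Queue(maxsize=1)  # a one-message "kernel buffer"

# Non-blocking receive: returns immediately when no message is available.
try:
    mailbox.get_nowait()
except queue.Empty:
    print("no message yet")

mailbox.put("hello")   # non-blocking send: the buffer has room
print(mailbox.get())   # blocking receive: a message is already waiting

# Partially blocking receive: give up after a timeout.
try:
    mailbox.get(timeout=0.1)
except queue.Empty:
    print("timed out")
```

With `maxsize=1`, a second `put` would block until the receiver drains the buffer, which mirrors how a full kernel buffer turns a non-blocking send into a blocking one.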
Objects
The key idea of OOP is to encapsulate data in objects. Data inside an object can only be accessed through its operations (methods). Parallelism is introduced by allowing several objects to execute at the same time, possibly on different processors. An object communicates with others by method invocation. Many parallel object-oriented languages allow the process inside an object to consist of multiple threads of control; synchronization of these threads is done with a monitor.
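A small sketch of the idea: an object whose data is reachable only through its methods, with an internal lock giving those methods monitor-style synchronization so multiple threads can invoke them safely (the `Counter` class is an illustrative example, not from the source):

```python
import threading

class Counter:
    """Data is encapsulated; the internal lock gives the
    methods monitor-like mutual exclusion."""
    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def bump(self):
        with self._lock:          # only one thread inside at a time
            self._value += 1

    def value(self):
        with self._lock:
            return self._value

c = Counter()
workers = [threading.Thread(target=lambda: [c.bump() for _ in range(1000)])
           for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(c.value())  # 4000
```

Callers never touch `_value` directly; all communication with the object goes through method invocation, as the model prescribes.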
Tuple Spaces
A flexible technique for parallel and distributed computing, similar to message passing. Data lives in a tuple space, and all processors can access the space.
View of a Tuple Space (diagram): processes 1-3 surround a tuple space containing tuples such as (req, P, 7), (A, 77), (rsp, Q, 8.1), and (B). A tuple t is inserted into the tuple space using Out(t) and removed using In(t); in the diagram, Process 3 executes In(X, 77) and Out(rsp, Q, 8.1).
Tuple Types: Simple Case
A tuple is inserted into TS using Out(P, x, y, z) (assume x, y, z are integers). The tuple is removed from TS using In(P, a:integer, b:integer, c:integer); the result is a=x, b=y, c=z. P is the name of the tuple. There is also Read, similar to In, except that the tuple is not removed from TS.
Formal vs. Actual Parameters
Parameters of the form p:t are formal parameters; the other parameters are actual parameters. In and Out accept both formal and actual parameters:
Op(Req, 77.2, i:integer, true, s:string)
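A toy tuple space can be sketched in a few lines of Python. Here formal parameters are written as types (e.g. `int`) and actual parameters as values; `Out`, `In`, and `Read` are modeled by `out`, `in_`, and `read` (the class and method names are illustrative, and `in_` is used because `in` is a Python keyword):

```python
import threading

class TupleSpace:
    """Toy tuple space: out inserts, in_ removes a matching tuple, read peeks."""
    def __init__(self):
        self._cond = threading.Condition()
        self._tuples = []

    def _match(self, tup, pattern):
        # An actual parameter must be equal; a formal (a type) must match by type.
        return len(tup) == len(pattern) and all(
            isinstance(f, p) if isinstance(p, type) else f == p
            for f, p in zip(tup, pattern))

    def out(self, *tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def in_(self, *pattern):
        with self._cond:
            while True:
                for tup in self._tuples:
                    if self._match(tup, pattern):
                        self._tuples.remove(tup)
                        return tup
                self._cond.wait()   # block until a matching tuple appears

    def read(self, *pattern):       # like in_, but the tuple stays in TS
        with self._cond:
            while True:
                for tup in self._tuples:
                    if self._match(tup, pattern):
                        return tup
                self._cond.wait()

ts = TupleSpace()
ts.out("P", 1, 2, 3)
print(ts.read("P", int, int, int))  # ('P', 1, 2, 3): still in TS
print(ts.in_("P", int, int, int))   # ('P', 1, 2, 3): now removed
```

Blocking `in_` on a missing tuple is what makes the model useful for synchronization: a process that executes `in_` simply waits until some other process `out`s a matching tuple.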
Data Parallelism
All processes execute the same algorithm but operate on different parts of a data set, usually an array. This model is more restrictive and less flexible than the others.
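A minimal sketch of the data-parallel style: every worker runs the same code, and only the slice of the array it operates on differs (the partial-sum task and the worker count are illustrative):

```python
import threading

data = list(range(1, 101))   # the data set: 1..100
num_workers = 4
partials = [0] * num_workers

def partial_sum(worker_id):
    # Every worker executes the same algorithm on its own slice of the array.
    chunk = len(data) // num_workers
    start = worker_id * chunk
    partials[worker_id] = sum(data[start:start + chunk])

workers = [threading.Thread(target=partial_sum, args=(i,))
           for i in range(num_workers)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(sum(partials))  # 5050
```

Each worker writes to its own slot of `partials`, so no lock is needed; combining the partial results at the end is the only sequential step.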