University of Pune
S.E. I.T.
Subject code: 214442
Computer Organization Part 44
Cluster Processors, UMA, NUMA
UNIT VI
Tushar B. Kute, Department of Information Technology, Sandip Institute of Technology & Research Centre, Nashik.
http://tusharkute.com

Clusters
A computer cluster is a group of linked computers that work together so closely that in many respects they form a single computer. The components of a cluster are commonly, but not always, connected to each other through a fast LAN. Here "computer" means a system that can run on its own, apart from the cluster. Such a computer in a cluster is typically referred to as a node.

Advantages of clustering
  Absolute scalability
  Incremental scalability
  High availability
  Cost effective

Cluster configurations
Homogeneous clusters
  Every single node is exactly the same.
Heterogeneous clusters
  Made from different kinds of computers. For example: a few Sun SPARCstation IPXs, a few Intel 486 machines, and a DEC Alpha.
  Made from different machines in the same architecture family. For example: a collection of Intel boxes where the machines are of different generations, such as a mixture of 486, Pentium I, and Pentium II.

Operating System Design Issues
  Failure management
  Load balancing
  Parallelizing computation
    Parallelizing compiler
    Parallelized applications
    Parametric computing

Cluster Computer Architecture
Cluster middleware services and functions
  Single entry point
  Single file hierarchy
  Single control point
  Single virtual networking
  Single memory space
  Single job management system
  Single I/O space
  Single process space
  Checkpointing
  Process migration

Comparison (slide content not recovered)

Uniform Memory Access
Uniform Memory Access (UMA) is a shared memory architecture used in parallel computers. All the processors in the UMA model share the physical memory uniformly. In a UMA architecture, the access time to a memory location is independent of which processor makes the request or which memory chip contains the transferred data.

Types of UMA
  UMA using bus-based SMP architectures
  UMA using crossbar switches
  UMA using multistage switching networks

Example: UMA (slide content not recovered)

Non-Uniform Memory Access
Non-Uniform Memory Access (NUMA) is a computer memory design used in multiprocessors, where the memory access time depends on the memory location relative to a processor. Under NUMA, a processor can access its own local memory faster than non-local memory, that is, memory local to another processor or memory shared between processors.
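
The difference between the two memory models can be sketched with a toy latency function. This is only an illustration: the latency values and the function names `uma_access_time` and `numa_access_time` are assumptions made for the example, not figures for any real machine.

```python
# Hypothetical latencies in nanoseconds, chosen only for illustration.
UMA_LATENCY_NS = 100
NUMA_LOCAL_NS = 80
NUMA_REMOTE_NS = 200


def uma_access_time(cpu: int, memory_module: int) -> int:
    """UMA: the latency is the same for every (processor, module) pair."""
    return UMA_LATENCY_NS


def numa_access_time(cpu_node: int, memory_node: int) -> int:
    """NUMA: the latency depends on whether the memory is local to the processor."""
    return NUMA_LOCAL_NS if cpu_node == memory_node else NUMA_REMOTE_NS


if __name__ == "__main__":
    # Print the latency seen by each processor for each memory location.
    for cpu in range(2):
        for mem in range(2):
            print(cpu, mem, uma_access_time(cpu, mem), numa_access_time(cpu, mem))
```

In the UMA column every entry is identical, while in the NUMA column only the accesses where the processor and the memory share a node get the lower value.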

Cache-Coherent NUMA (CC-NUMA)
The system runs only one OS and shows only a single memory image to the user, even though the memory is physically distributed over the processors. Because each processor can access its own memory much faster than that of other processors, memory access is non-uniform.

Vector Processing
Vector processing is a CPU design in which the instruction set includes operations that can perform mathematical operations on multiple data elements simultaneously. This is in contrast to a scalar processor, which handles one element at a time using multiple instructions.

Examples and Applications
  Radar and signal processing for detection of space/underwater targets
  Remote sensing for earth resource exploration
  Computational wind tunnel experiments
  3D stop-action computer-assisted tomography
  Weather forecasting
  Medical diagnosis

Vector Processing Approach
Instead of pipelining just the instructions, vector processors also pipeline the data itself. They are fed instructions that say not just to add A to B, but to add a whole group of numbers to another group of numbers.

Illustrations
Programming language
  Execute this loop 10 times:
    Read the next instruction and decode it
    Fetch the first number
    Fetch the second number
    Add them
    Put the result here
  End loop
Vector processing
  Read the instruction and decode it
  Fetch 10 numbers
  Fetch 10 numbers
  Add them
  Put the results here
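
The two illustrations can be mimicked in software with a rough model that counts how many times an instruction is read and decoded. This is a sketch, not a description of real hardware: the functions `scalar_add` and `vector_add` and the `decodes` counter are hypothetical names introduced for the example, and the Python list comprehension only stands in for the hardware operating on all ten pairs.

```python
def scalar_add(a, b):
    """Scalar style: one instruction is read and decoded per pair of numbers."""
    result = []
    decodes = 0
    for x, y in zip(a, b):
        decodes += 1          # read the next instruction and decode it
        result.append(x + y)  # fetch both numbers, add them, store the result
    return result, decodes


def vector_add(a, b):
    """Vector style: one decoded instruction covers every pair of numbers."""
    decodes = 1                             # read the instruction and decode it once
    result = [x + y for x, y in zip(a, b)]  # stands in for the vector unit
    return result, decodes


if __name__ == "__main__":
    first = list(range(10))
    second = [10] * 10
    print(scalar_add(first, second))
    print(vector_add(first, second))
```

Both functions return the same sums for the ten pairs; only the decode count differs, which is the saving the illustration points at.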

Vector computations
  Pipelined ALU
  Parallel ALU
  Parallel processors

Bus Arbitration
The device that is allowed to initiate data transfers on the bus at any given time is called the bus master. More than one device, such as the processor or a DMA controller, may be capable of acting as bus master, and these devices share the system bus. When the current master relinquishes control of the bus, another master can acquire control of it. Bus arbitration is the process by which the next device to become the bus master is selected and bus mastership is transferred to it. The selection of the bus master is usually done on a priority basis.

Centralized arbitration
A single bus arbiter performs the required arbitration. The bus arbiter may be the processor or a separate controller connected to the bus.
Methods:
  Daisy chaining
  Polling
  Independent request
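
As a rough illustration of priority-based selection, the sketch below models daisy chaining. The slides name the method without describing it, so the sketch assumes the usual scheme in which the arbiter's grant signal passes from device to device and the first requesting device keeps it, so that position in the chain sets priority. The function name `daisy_chain_grant` and the list-of-booleans representation are hypothetical.

```python
from typing import Optional, Sequence


def daisy_chain_grant(requests: Sequence[bool]) -> Optional[int]:
    """Return the index of the device that becomes bus master, or None.

    requests[i] is True when device i is requesting the bus.
    Device 0 is assumed to sit closest to the arbiter in the chain.
    """
    if not any(requests):
        return None  # no request pending, so no grant is issued
    for position, requesting in enumerate(requests):
        if requesting:
            # This device keeps the grant and does not pass it down the chain.
            return position
    return None


if __name__ == "__main__":
    # Devices 1 and 2 both request; device 1 is nearer the arbiter and wins.
    print(daisy_chain_grant([False, True, True]))
```

Under this assumption, a device further down the chain gets the bus only when no device ahead of it is requesting.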

References
  Computer Architecture and Organization, by A. P. Godse (from books.google.com)
  Computer Organization, by Hamacher and Zaky
  Computer Organization and Architecture, by William Stallings