CSC630/CSC730: Parallel Computing

Parallel Computing Platforms
Chapter 2 (2.4.1 to 2.4.4)
Dr. Joe Zhang
PDC-4: Topology

Content
  Parallel computing platforms
    Logical organization (a programmer's view)
      Control structure
      Communication model
    Physical organization (actual hardware)
      Interconnection networks
      Network topologies
      Characteristics

Interconnection Networks
  There are two main types of interconnection networks: static networks and dynamic networks.

Static Networks
  Also called direct networks.
  Each vertex corresponds to a node.
  Nodes are joined by point-to-point communication links.
  There are no switches at the vertices of a static network.
  If two nodes have no direct connection, intermediate nodes must forward communication between them.
  Static networks can be arranged as a linear array, a ring, a 2D mesh, a 3D mesh, a 2D torus, or a hypercube, in roughly increasing order of connectivity.
  Examples
    The Intel Paragon: a 2D mesh
    The Cray T3E: a 3D torus
    Both scale to thousands of nodes.

Dynamic Networks
  Also called indirect networks.
  Some vertices correspond to switches that route communications.
  A crossbar switch would be optimal but is very expensive.
  Most switch-based networks are multistage; omega networks are an example.

Network Topology - Bus

Bus
  The cost of the network scales linearly with the number of nodes: O(p).
  The distance between any two nodes is O(1).
  Ideal for broadcasting information among nodes.
  The bounded bandwidth of the shared bus limits performance.
  To reduce demand on bus bandwidth:
    Provide a cache for each node.
    Cache private data locally.
    Access only remote data through the bus.
  Scalable in terms of cost but unscalable in terms of performance.

Network Topology - Crossbar

Crossbar
  A non-blocking network.
  Total number of switches: $\Theta(pb)$ for p processing nodes and b memory banks.
  Assume that b is at least p. (Is this reasonable?)
  As p increases, the complexity grows as $\Omega(p^2)$.
  Scalable in terms of performance but unscalable in terms of cost.

Network Topology - Multistage

Multistage Network -- Omega
  An intermediate class of networks:
    More scalable than the bus in terms of performance.
    More scalable than the crossbar in terms of cost.
  A commonly used multistage network is the Omega network:
    p processing nodes
    b memory banks (b = p)
    log p stages
    In each stage, a link exists between input i and output j according to the pattern below.

Interconnection Pattern (Omega)
  Output j is obtained by a left rotation of the binary representation of input i (a perfect shuffle):

  $$ j = \begin{cases} 2i, & 0 \le i \le p/2 - 1 \\ 2i + 1 - p, & p/2 \le i \le p - 1 \end{cases} $$
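The interconnection pattern can be checked with a few lines of Python. This is a minimal sketch, assuming p is a power of two; the function names `shuffle` and `rotate_left` are illustrative and do not come from the slides. It evaluates the two-case formula for j and compares it with an explicit one-bit left rotation of i.

```python
def shuffle(i: int, p: int) -> int:
    """Output j wired to input i in one Omega stage (p assumed a power of two)."""
    if i < p // 2:
        return 2 * i
    return 2 * i + 1 - p


def rotate_left(i: int, p: int) -> int:
    """Left-rotate the log2(p)-bit binary representation of i by one position."""
    bits = p.bit_length() - 1        # log2(p) when p is a power of two
    msb = (i >> (bits - 1)) & 1      # the bit that wraps around to the low end
    return ((i << 1) & (p - 1)) | msb


if __name__ == "__main__":
    p = 8
    for i in range(p):
        j = shuffle(i, p)
        # Print input and output labels as 3-bit binary strings (p = 8).
        print(f"{i:03b} -> {j:03b}  rotation agrees: {j == rotate_left(i, p)}")
```

Both functions describe the same wiring, so either form can be used when building a simulator; the rotation form makes the bit-level structure explicit.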

Omega Network
  Switching nodes: (p/2) log p
  Cost of network: $\Theta(p \log p)$
  Routing data in an Omega network:
    Let s be the binary representation of a processor that needs to write data into memory bank t.
    First stage: if the most significant bits of s and t are the same, the data is routed in pass-through mode; if they are different, it is routed in cross-over mode.
    This is repeated at each following stage using the next most significant bit.

Blocking in Omega Network
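The routing rule and the idea of blocking can be sketched in Python. This is an illustrative model, not a definitive implementation: it assumes p is a power of two, that each stage consists of a perfect shuffle followed by a column of 2x2 switches, and that two messages block each other when they need the same switch output link at the same stage. The names `route` and `conflicts` and the source/destination pairs in the demo are chosen here for illustration.

```python
def route(s: int, t: int, p: int):
    """Return (modes, links) for a message from processor s to memory bank t.

    modes[k] is the switch setting at stage k; links[k] is (stage, output line).
    """
    stages = p.bit_length() - 1              # log2(p), p assumed a power of two
    pos, modes, links = s, [], []
    for k in range(stages):
        # Perfect shuffle: left-rotate the current line number by one bit.
        pos = ((pos << 1) & (p - 1)) | (pos >> (stages - 1))
        s_bit = pos & 1                      # k-th most significant bit of s
        t_bit = (t >> (stages - 1 - k)) & 1  # k-th most significant bit of t
        modes.append("pass-through" if s_bit == t_bit else "cross-over")
        pos = (pos & ~1) | t_bit             # leave the switch on output t_bit
        links.append((k, pos))
    assert pos == t                          # sanity check: the message arrived
    return modes, links


def conflicts(a, b, p):
    """Links (stage, line) that the two (s, t) routes both need."""
    return sorted(set(route(*a, p)[1]) & set(route(*b, p)[1]))


if __name__ == "__main__":
    p = 8
    print(route(0b010, 0b111, p)[0])
    print(route(0b110, 0b100, p)[0])
    print(conflicts((0b010, 0b111), (0b110, 0b100), p))
```

In this model the two messages in the demo need the same output link after the first stage, so only one of them can proceed at a time; that is what makes the Omega network a blocking network.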

Completely-Connected Network, Star-Connected Network
  Desirable but impractical

Completely Connected Network
  Each node has a link to every other node.
  With n nodes, each node has n-1 links to the other n-1 nodes, so there are n(n-1)/2 links in all.
  Practical only for small n; not practical for large n.
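The link count can be worked through as a short derivation. Each of the n nodes contributes n-1 link endpoints, and every link is counted once from each of its two ends; the value n = 8 below is only an example.

```latex
\[
  \text{links} \;=\; \frac{n\,(n-1)}{2},
  \qquad\text{e.g. } n = 8:\quad \frac{8 \cdot 7}{2} = 28 .
\]
```

The quadratic growth in n is why the topology is reserved for small networks.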

Linear Array
  Line/Ring: each node has at most two links and is linked only to its neighboring nodes.
  An n-node ring requires n links.
  In a line, the two end nodes are farthest apart, so the diameter is n-1.

2D and 3D Meshes
  N = 16 (4 x 4 mesh, no wraparound): 24 links, diameter 2*(sqrt(16)-1) = 6
  N = 16: 32 links, diameter 4 (consistent with a 4 x 4 mesh with wraparound links)
  Regularly structured computations map naturally onto a 2D or 3D mesh.
  A 3D cube is used in the Cray T3E.
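Link counts and diameters for small meshes can be verified by brute force. The sketch below builds a k x k mesh as an adjacency map, optionally with wraparound links, counts its links, and computes the diameter with a breadth-first search from every node. The helper names `mesh`, `link_count`, and `diameter` are illustrative and not taken from the slides.

```python
from collections import deque
from itertools import product


def mesh(k: int, wrap: bool) -> dict:
    """Adjacency sets of a k x k mesh; wrap=True adds wraparound links."""
    adj = {node: set() for node in product(range(k), repeat=2)}
    for r, c in adj:
        for dr, dc in ((0, 1), (1, 0)):      # link to the right and downward
            nr, nc = r + dr, c + dc
            if wrap:
                nr, nc = nr % k, nc % k
            if nr < k and nc < k:
                adj[(r, c)].add((nr, nc))
                adj[(nr, nc)].add((r, c))
    return adj


def link_count(adj: dict) -> int:
    """Each link appears in two adjacency sets, so halve the total."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


def diameter(adj: dict) -> int:
    """Largest shortest-path length (in links) over all node pairs, via BFS."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


if __name__ == "__main__":
    for wrap in (False, True):
        g = mesh(4, wrap)
        print(f"wraparound={wrap}: links={link_count(g)}, diameter={diameter(g)}")
```

The same two helpers work for any topology expressed as an adjacency map, so a ring or a hypercube can be checked the same way.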

Hypercube
  Construction: build a cube with p nodes from two subcubes of p/2 nodes each.
  Numbering scheme for nodes in a hypercube:
    Derived from the construction of the hypercube.
    Prefix the labels of one subcube with a 0 and the labels of the other subcube with a 1.
  Useful property:
    The minimum distance between two nodes is the number of bits that differ in their labels.
    Nodes labeled 0110 and 0101 are two links apart.
    Useful for deriving a number of parallel algorithms.
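The distance property maps directly onto bit operations: the number of differing bits is the population count of the XOR of the two labels. A minimal sketch, with illustrative function names that are not part of the slides:

```python
def hypercube_distance(a: int, b: int) -> int:
    """Minimum number of links between nodes a and b (Hamming distance)."""
    return bin(a ^ b).count("1")


def neighbors(node: int, d: int) -> list:
    """Labels of the d nodes directly linked to `node` in a d-dimensional cube."""
    return [node ^ (1 << k) for k in range(d)]   # flip one bit per dimension


if __name__ == "__main__":
    print(hypercube_distance(0b0110, 0b0101))
    print([f"{n:04b}" for n in neighbors(0b0110, 4)])
```

Flipping one bit at a time also gives a simple route between two nodes: correct the differing bits in any order, one link per bit.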

Tree-Based Network
  Tree network (binary tree network or hierarchical tree network): each non-leaf node has links to two child nodes.
  Total nodes in a tree with levels 0 through j: $2^{j+1} - 1$
    Root level (level 0): one node
    First level: two nodes
    Second level: four nodes
    jth level: $2^j$ nodes
  The CM-5 system uses this kind of architecture.
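The total follows from summing the per-level counts as a geometric series. The worked value j = 3 is only an example.

```latex
\[
  \sum_{i=0}^{j} 2^{i} \;=\; 2^{j+1} - 1,
  \qquad\text{e.g. } j = 3:\quad 1 + 2 + 4 + 8 = 15 = 2^{4} - 1 .
\]
```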

Cost and Performance of Static Networks
  Network criteria
    Diameter
      The maximum distance between any two processing nodes in the network.
      The distance between two processing nodes is the length of the shortest path (in number of links) between them.
    Connectivity
      A measure of the multiplicity of paths between any two processing nodes.
      High connectivity is desirable because it reduces contention.
    Arc connectivity
      The minimum number of arcs that must be removed from the network to break it into two disconnected networks.
    Bisection width
      The minimum number of communication links that must be removed to partition the network into two equal halves.
      Bisection width of a completely connected network: $p^2/4$
    Bisection bandwidth
      The minimum volume of communication allowed between any two halves of the network.
    Cost
      Number of communication links.
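For small networks the bisection width can be computed by brute force, which makes the definition concrete. The sketch below tries every split of the nodes into two equal halves and counts the links that cross the cut. It is only practical for small p, since the number of splits grows very quickly; the function name `bisection_width` and the example networks (a ring and a 3-dimensional hypercube, included for comparison) are choices made here, not taken from the slides.

```python
from itertools import combinations


def bisection_width(nodes, links) -> int:
    """Minimum number of links crossing any split into two equal halves."""
    nodes = list(nodes)
    first, rest = nodes[0], nodes[1:]
    best = len(links)
    # Keep `first` in one half so that each split is examined only once.
    for chosen in combinations(rest, len(nodes) // 2 - 1):
        half = {first, *chosen}
        crossing = sum((u in half) != (v in half) for u, v in links)
        best = min(best, crossing)
    return best


if __name__ == "__main__":
    p = 8
    complete = list(combinations(range(p), 2))
    ring = [(i, (i + 1) % p) for i in range(p)]
    hypercube = [(u, u ^ (1 << k)) for u in range(p) for k in range(3)
                 if u < (u ^ (1 << k))]      # list each link once
    print("complete :", bisection_width(range(p), complete), "vs p^2/4 =", p * p // 4)
    print("ring     :", bisection_width(range(p), ring))
    print("hypercube:", bisection_width(range(p), hypercube))
```

For the completely connected network every equal split cuts the same number of links, (p/2)(p/2), which is where the $p^2/4$ figure comes from.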

Characteristics of Static Networks

Summary
  Interconnection networks
  Static and dynamic networks
  Network topology
  Characteristics

Questions?