Chapter 3: Topology basics

1. Chapter 3: Topology basics
What is the network topology
Nomenclature
Traffic patterns
Performance
Packaging cost
Case study: the SGI Origin 2000

2. Network topology (1)
The topology is the static arrangement of channels and nodes in an interconnection network.
Topology selection is the first step in the design of a network. It specifies both the type of network and the associated details.
Selecting a good topology means fitting the requirements to the available packaging technology. The design depends on the number of ports and on their duty factor, but also on the pins available per chip and board, the wire density, the signaling rate and the length of cables.
The choice is based on cost and performance. Performance is evaluated in terms of throughput and latency. Cost depends on the number and complexity of the chips, as well as on the density and length of the interconnections used.

3. Network topology (2)
The choice cannot be based only on the data communication model of the problem. Matching the network to the problem seems a good choice, but a special purpose network is generally a bad idea:
The load is poorly balanced, because of dynamic load imbalance and/or a mismatch between problem size and machine size. If data and threads are modified to balance the load, the initial match is lost.
The available packaging does not allow implementation of such networks.
The network is inflexible. If the algorithm changes, the network cannot be modified to follow it.
Some examples are shown in Fig. 3.1.

4. Nomenclature (1)
Nodes and channels
$N^*$ is the set of nodes and $N \subseteq N^*$ the set of terminal nodes; $C$ is the set of channels.
Channel: $c = (x, y) \in C$, where $x, y \in N^*$.
$s_c$: source node; $d_c$: destination node.
$w_c$: channel width.
$f_c$: channel frequency.
$l_c$: physical length.
$t_c$: latency; in general $l_c = v t_c$, where $v$ is the propagation velocity.
$b_c$: channel bandwidth, $b_c = w_c f_c$.
Switch node $x \in N^*$
Channel set: $C_x = C_{Ix} \cup C_{Ox}$.
Degree: $\delta_x = |C_x| = \delta_{Ix} + \delta_{Ox}$, the sum of the in degree and the out degree. If the degree is the same for every node, it is written simply as $\delta$.
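
The nomenclature above can be made concrete with a minimal Python sketch. The `Channel` class, the `degree` helper and the 4-node ring are hypothetical names and values chosen only for illustration; they do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    src: int        # s_c, source node
    dst: int        # d_c, destination node
    width: int      # w_c, number of parallel signals
    freq_hz: float  # f_c, bits per second on each signal

    @property
    def bandwidth(self) -> float:
        """b_c = w_c * f_c, in bits per second."""
        return self.width * self.freq_hz


def degree(node, channels):
    """Return (in degree, out degree, total degree) of a node."""
    d_in = sum(1 for c in channels if c.dst == node)
    d_out = sum(1 for c in channels if c.src == node)
    return d_in, d_out, d_in + d_out


# Hypothetical 4-node bidirectional ring, 16-bit channels at 1 GHz (assumed values).
ring = [Channel(i, (i + 1) % 4, 16, 1e9) for i in range(4)]
ring += [Channel((i + 1) % 4, i, 16, 1e9) for i in range(4)]

print(ring[0].bandwidth)  # bandwidth of one channel, in bits per second
print(degree(0, ring))    # (in, out, total) degree of node 0
```

Every node of this ring has the same degree, so the subscript can be dropped and the degree written as $\delta$.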

5. Nomenclature (2)
Direct and indirect networks
In a direct network, every node is both a terminal and a switch (Fig. 3.1a). Packets are forwarded directly between terminal nodes, and the resources of a terminal are available to each switch.
In an indirect network, a node is either a terminal or a switch (Fig. 3.1b). Packets are forwarded indirectly through dedicated switch nodes.
Every direct network can be represented as an indirect network by splitting each node into a terminal and a switch (Fig. 3.2).

6. Nomenclature (3)
Cuts
A cut $C(N_1, N_2)$ is a set of channels that partitions the set of all nodes into two disjoint sets $N_1$ and $N_2$. Each channel of the cut connects a node in $N_1$ to a node in $N_2$.
The total bandwidth of the cut is $B(N_1, N_2) = \sum_{c \in C(N_1, N_2)} b_c$.
Bisections
A bisection is a cut that divides the entire network nearly in half.
The channel bisection, the minimum channel count over all bisections, is denoted $B_C$ (Sec. 3.1.3).
The bisection bandwidth, the minimum bandwidth over all bisections, is denoted $B_B$ (Sec. 3.1.3).
If the network has a uniform channel bandwidth $b$, then $B_B = b B_C$.
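
As an illustration of cuts and bisections, the following Python sketch finds the channel bisection of a small network by exhaustive search. It is only practical for a handful of nodes, it treats every node as a terminal, and the helper names, the 8-node bidirectional ring and the unit bandwidth `b` are assumptions made for the example.

```python
from itertools import combinations


def cut_channels(channels, n1):
    """Channels with one endpoint in n1 and the other endpoint outside it."""
    n1 = set(n1)
    return [(x, y) for (x, y) in channels if (x in n1) != (y in n1)]


def channel_bisection(nodes, channels):
    """B_C: minimum cut size over all balanced partitions (brute force)."""
    nodes = list(nodes)
    half = len(nodes) // 2
    return min(len(cut_channels(channels, n1))
               for n1 in combinations(nodes, half))


k = 8
# Bidirectional ring modelled as 2k unidirectional channels.
ring = [(i, (i + 1) % k) for i in range(k)]
ring += [((i + 1) % k, i) for i in range(k)]

cut = cut_channels(ring, range(4))       # one particular bisection
B_C = channel_bisection(range(k), ring)  # minimum over all bisections
b = 1.0                                  # assumed uniform channel bandwidth
B_B = b * B_C                            # B_B = b * B_C

print(sorted(cut), B_C, B_B)
```

For the ring, the contiguous split into nodes 0 to 3 and nodes 4 to 7 already achieves the minimum, which is why the particular cut and the exhaustive search agree.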

7. Nomenclature (4)
Paths
A path (or route) is an ordered set of channels $P = \{c_1, c_2, \ldots, c_n\}$ in which the destination node of each channel is the source node of the next one, $d_{c_i} = s_{c_{i+1}}$.
If at least one path exists between every source-destination pair, the network is connected.
A minimal path from $x$ to $y$ is a path with the smallest hop count connecting the two nodes. The set of all minimal paths from $x$ to $y$ is denoted $R_{xy}$, and the hop count of a minimal path is $H(x, y)$.
The diameter $H_{max}$ is the largest minimal hop count over all pairs. It is bounded for a fully-connected network (eq. 3.1).
The average minimum hop count $H_{min}$ is the average hop count over all sources and destinations (Sec. 3.1.4).
The physical distance of a path is $D(P) = \sum_{c \in P} l_c$ and its delay is $t(P) = D(P) / v$.
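
The path metrics can be computed with a breadth-first search. The sketch below is illustrative only: the function names and the 8-node ring are assumptions, and the average is taken over all $N^2$ ordered pairs, including each node paired with itself (hop count 0).

```python
from collections import deque


def hop_counts(adj, src):
    """Minimal hop count H(src, y) for every node y reachable from src (BFS)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def diameter_and_average(adj):
    """Return (H_max, H_min) for a connected network given as adjacency lists."""
    nodes = list(adj)
    tables = [hop_counts(adj, x) for x in nodes]
    if any(len(t) != len(nodes) for t in tables):
        raise ValueError("network is not connected")
    h_max = max(max(t.values()) for t in tables)
    h_min = sum(sum(t.values()) for t in tables) / len(nodes) ** 2
    return h_max, h_min


k = 8
ring = {i: [(i + 1) % k, (i - 1) % k] for i in range(k)}
h_max, h_min = diameter_and_average(ring)
print(h_max, h_min)
```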

8. Nomenclature (5)
Symmetry
A network is vertex-symmetric if there exists an automorphism that maps any node $a$ onto any other node $b$. Basically, the topology looks the same from the point of view of every node. This can simplify routing.
A network is edge-symmetric if there exists an automorphism that maps any channel $a$ onto any other channel $b$. This improves the load balance.

9. Traffic patterns (1)
A traffic pattern is the spatial distribution of messages in the interconnection network.
Traffic matrix $\Lambda$: each matrix element $\lambda_{s,d}$ gives the fraction of traffic sent from $s$ to $d$.
Common static traffic patterns are listed in Tab. 3.1.
Random traffic: each source $s$ is equally likely to send to each destination. It balances the load even for topologies and routing algorithms with very poor load balance.
Permutation traffic: each source $s$ sends all of its traffic to a single destination. Permutations stress the load balance of a topology and of a routing algorithm.

10. Traffic patterns (2)
Bit permutations: the destination address is computed by permuting and selectively complementing the bits of the source address.
Digit permutations: the digits of the destination address are calculated from the digits of the source address. They apply only to networks in which the terminal addresses can be expressed as n-digit, radix-k numbers.
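
A few bit permutations can be sketched in Python as follows. The function names and the 4-bit address width are assumptions made for the example, and the slides do not list which permutations appear in Tab. 3.1. The `traffic_matrix` helper builds the matrix $\Lambda$ for a permutation pattern, where each row has a single entry equal to 1.

```python
def bit_complement(s, nbits):
    """Destination = source with every address bit inverted."""
    return ~s & ((1 << nbits) - 1)


def bit_reverse(s, nbits):
    """Destination = source with the order of the address bits reversed."""
    d = 0
    for i in range(nbits):
        if (s >> i) & 1:
            d |= 1 << (nbits - 1 - i)
    return d


def transpose(s, nbits):
    """Destination = source with upper and lower address halves swapped (even nbits)."""
    half = nbits // 2
    mask = (1 << half) - 1
    return ((s & mask) << half) | (s >> half)


def traffic_matrix(perm, nbits):
    """Lambda for permutation traffic: lambda[s][d] = 1 when d = perm(s), else 0."""
    n = 1 << nbits
    lam = [[0.0] * n for _ in range(n)]
    for s in range(n):
        lam[s][perm(s, nbits)] = 1.0
    return lam


NBITS = 4
for perm in (bit_complement, bit_reverse, transpose):
    print(perm.__name__, [perm(s, NBITS) for s in range(1 << NBITS)])
```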

11. Performance and cost
A topology is selected on the basis of performance and cost.
Performance is evaluated in terms of:
Throughput and maximum channel load
Latency
Path diversity
The cost of a topology is determined by the sum of all the constraints that derive from the packaging technology used.

12. Throughput
The throughput is the data rate, in bits per second, that the network accepts per input port.
It depends on routing and flow control as much as on the topology.
The ideal throughput is evaluated assuming perfect flow control and balanced routing. The ideal throughput of a network on uniform traffic is often referred to as its capacity.
Maximum throughput occurs when some channel becomes saturated. To calculate the throughput, the channel load must be considered.

13. Channel load
The channel load is the ratio of the bandwidth demanded from channel $c$ to the bandwidth of the input ports.
The maximum channel load is the load of the channel that carries the largest fraction of the traffic for a specific traffic pattern.
When the offered traffic reaches the throughput of the network, the load on this channel equals the channel bandwidth. Any additional traffic would overload the channel.
The ideal throughput of a topology is expressed in eq. 3.2.
Maximum channel load and throughput can be computed by solving a multicommodity flow problem.
For uniform traffic, it is possible to calculate some upper and lower bounds.

14. Throughput upper bound in a uniform traffic pattern
The load on the bisection channels gives a lower bound on the maximum channel load, and therefore an upper bound on throughput.
For uniform traffic, $N/2$ packets must cross the $B_C$ bisection channels. As a consequence, the load on each bisection channel is at least the value given in eq. 3.3.
This gives an upper bound on the throughput (eq. 3.4).
For example, in a $k$-node ring, $B_C = 4$ and the ideal throughput is equal to $8b/k$.
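
The bisection argument can be checked numerically. The slides cite eq. 3.3 and eq. 3.4 without reproducing them, so the sketch below assumes the forms $\gamma_B = N / (2 B_C)$ and $\Theta \le b / \gamma_B = 2 b B_C / N$, which are consistent with the $8b/k$ figure quoted for the ring. The function names are hypothetical.

```python
from fractions import Fraction


def bisection_load_bound(n_terminals, b_c):
    """Assumed form of eq. 3.3: N/2 packets spread evenly over B_C channels."""
    return Fraction(n_terminals, 2 * b_c)


def throughput_upper_bound(n_terminals, b_c, bandwidth):
    """Assumed form of eq. 3.4: channel bandwidth divided by the load bound."""
    return bandwidth / bisection_load_bound(n_terminals, b_c)


# k-node ring with B_C = 4 and unit channel bandwidth b = 1.
for k in (8, 16, 64):
    print(k, throughput_upper_bound(k, 4, 1))  # expected to match 8b/k
```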

15. Channel load bounds in a uniform traffic pattern
A lower bound on the channel load can be computed as follows. The product $H_{min} N$ gives the channel demand for a given traffic pattern. Dividing this demand by the number of channels bounds the load (eq. 3.5).
These lower bounds can be complemented with a simple upper bound by considering a balanced routing function. If there are $|R_{xy}|$ minimal paths from $x$ to $y$, a load of $1/|R_{xy}|$ is placed on each channel of each minimal path. The maximum load is defined mathematically in eq. 3.6.
For any topology, $\gamma_{max,LB} \le \gamma_{max} \le \gamma_{max,UB}$.
For an edge-symmetric topology, both bounds coincide with the maximum channel load.

16. Example of ideal throughput estimation in an eight-node ring network
The topology of the network considered is shown in Fig. 3.3.
The upper bound approach is applied to channel (3,4). In Figure 3.3, dotted lines represent paths that count as half. There are 6 solid lines and 4 dotted lines, so the maximum channel load is equal to 1.
The lower bound gives the same result: $H_{min} N / C = 2 \cdot 8 / 16 = 1$.
In the general case, an optimal distribution that minimizes the channel load should be computed. Computing the solution is beyond the scope of this book; it is enough to describe the problem formulation.
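
The counts in this example can be reproduced with a short Python sketch. It assumes a bidirectional ring, uniform traffic with $\lambda = 1/N$ for every source-destination pair, and an even split between the two minimal paths when source and destination are diametrically opposite. The function names are hypothetical.

```python
from fractions import Fraction


def ring_channel_load(k, u):
    """Load on clockwise channel (u, u+1 mod k) of a k-node bidirectional ring.

    Returns (paths counted in full, paths counted as half, channel load).
    """
    full = half = 0
    for s in range(k):
        for d in range(k):
            cw = (d - s) % k    # clockwise hop count from s to d
            ccw = (s - d) % k   # counterclockwise hop count
            if cw == 0 or cw > ccw:
                continue        # no clockwise minimal path
            if (u - s) % k >= cw:
                continue        # the clockwise path does not cross channel u
            if cw == ccw:
                half += 1       # two minimal paths, each carries half
            else:
                full += 1
    load = (full + Fraction(half, 2)) * Fraction(1, k)
    return full, half, load


def ring_load_lower_bound(k):
    """H_min * N / C for a k-node bidirectional ring with C = 2k channels."""
    total = sum(min((d - s) % k, (s - d) % k)
                for s in range(k) for d in range(k))
    h_min = Fraction(total, k * k)
    return h_min * k / (2 * k)


print(ring_channel_load(8, 3))   # solid paths, dotted paths, load on (3,4)
print(ring_load_lower_bound(8))  # lower bound on the maximum channel load
```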

17. Formulation of the mathematical problem
For each destination $d$, a vector $x_d$ defines the average distribution of packets over the channels.
A valid distribution is obtained by adding flow balance equations at each node: the sum of the incoming distributions minus the sum of the outgoing distributions must equal the average number of packets that the node is sourcing (+) or sinking (-).
For a distribution under uniform traffic, every terminal node sources $1/N$ units and the destination sinks 1 unit. This is represented by the balance vector $f_d$ (eq. 3.7).
The topology is expressed by the matrix $A$ (eq. 3.8).
The optimization problem is written in eq. 3.9.
By modifying eq. 3.6 and eq. 3.9, the problem can be generalized to an arbitrary traffic pattern.

18. Latency
Latency is the time required for a packet to traverse the network.
It can be divided into two components (eq. 3.10):
The head latency is the time required for the head of the message to traverse the network.
The serialization latency is the time required for the tail to catch up.
Latency depends on the topology, the routing, the flow control and the design of the router. We will focus on the contribution of the topology.

19. Dependency of the latency on the topology choice
When no contention occurs, head latency depends on two factors connected with the topology:
The router delay, which is the time spent in the routers.
The flight delay, which is the time spent on the wires.
The average router delay is $H_{min} t_r$, while the average flight delay is $D_{min} / v$. The resulting expression for the zero-load latency is given in eq. 3.11.
In case of contention, an additional term $T_c$ has to be added to account for the time spent waiting for resources.
$H_{min}$, $D_{min}$ and $b$ in eq. 3.11 depend mostly on the topology (but also on the packaging).

20. Examples
First example: a packet propagating on a two-hop route from node x to node z, via node y (Fig. 3.4).
First row: each phit of the packet arriving at node x.
Second row: leaving x (routing delay $t_r$).
Third row: arriving at y (link latency $t_{xy}$).
Fourth row: leaving y (second routing delay $t_r$).
Fifth row: arriving at z (link latency $t_{yz}$).
To this head latency the serialization latency $L/b$ must be added.
Second example: a 64-node network with $H_{avg} = 4$ hops and 16-bit wide channels.
The frequency is $f_c = 1$ GHz, with $t_c = 5$ ns and $t_r = 8$ ns.
The total routing delay is 32 ns (8 * 4).
The total wire delay is 20 ns (5 * 4).
If $L = 64$ bytes and $b = 2$ Gbytes/s, the serialization delay is 32 ns.
The total latency is 84 ns.
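
The arithmetic of the 64-node example can be written out as a small Python sketch. The function and parameter names are hypothetical; the wire delay is modelled as hops times the per-channel latency, following the 5 * 4 figure above, and bandwidth is taken as channel width times frequency.

```python
def zero_load_latency_ns(hops, t_router_ns, t_channel_ns,
                         packet_bits, width_bits, freq_ghz):
    """Return (routing, wire, serialization, total) delays in nanoseconds."""
    routing = hops * t_router_ns
    wire = hops * t_channel_ns
    # width * frequency in GHz gives the bandwidth in bits per nanosecond.
    serialization = packet_bits / (width_bits * freq_ghz)
    return routing, wire, serialization, routing + wire + serialization


routing, wire, serialization, total = zero_load_latency_ns(
    hops=4, t_router_ns=8, t_channel_ns=5,
    packet_bits=64 * 8, width_bits=16, freq_ghz=1.0)

print(routing, wire, serialization, total)
```

With 16-bit channels at 1 GHz the bandwidth is 16 Gbit/s, that is 2 Gbytes/s, which is how the 32 ns serialization delay for a 64-byte packet arises.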

21. Path diversity
A network with multiple routes between most pairs of nodes is more robust than a network with only a single route. This property is called path diversity. It improves both the balance of the channel load and the fault tolerance.
Path diversity can be illustrated by considering a network under arbitrary permutation traffic, which is more challenging than uniform traffic. Without path diversity, traffic can be concentrated on a single bottleneck channel.
Path diversity also makes it possible to handle faults. For large networks it is critical to tolerate faulty nodes or links. One measure of network fault tolerance is the number of edge-disjoint or node-disjoint paths between two nodes. However, if a fault affects all the neighbors of a node, there is no solution: the network is no longer connected.

22. Example
Bit permutation traffic: every node sends a packet to the destination whose address is a bit permutation of its own. The sequence is {0,2,4,6,8,10,12,14,1,3,5,7,9,11,13,15}.
Behavior of a 2-ary 4-fly butterfly (Fig. 3.5):
All the packets from nodes 0, 1, 8 and 9 traverse channel (10,20). The same situation occurs for the other nodes.
The channel load is equal to 4 and the throughput is 25% of the capacity.
Behavior of a 4-ary 2-cube network (Fig. 3.6):
2 routes traverse no channel, 4 routes one channel, 4 routes two channels, 4 routes three channels and 2 routes four channels.
The one-hop channels are the bottleneck. For this network the throughput is 50% of capacity.
If the 4 one-hop routes also use non-minimal paths, the traffic is spread uniformly. The resulting throughput can reach 89% of capacity.
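
The route lengths on the 4-ary 2-cube can be recomputed with the sketch below. It rests on two assumptions not stated in the slides: the listed sequence is read as the destination of source i, which matches a one-bit left rotation of the 4-bit address, and node numbers map to torus coordinates as two radix-4 digits. Under these assumptions the 16 routes split into 2, 4, 4, 4 and 2 routes of zero to four hops.

```python
from collections import Counter


def rotate_left(s, nbits):
    """Rotate the nbits-wide address s left by one bit position."""
    mask = (1 << nbits) - 1
    return ((s << 1) & mask) | (s >> (nbits - 1))


def torus_hops(a, b, k=4):
    """Minimal hop count between nodes a and b of a k-ary 2-cube."""
    hops = 0
    # divmod splits a node number into its two radix-k digits (coordinates).
    for ca, cb in zip(divmod(a, k), divmod(b, k)):
        delta = (cb - ca) % k
        hops += min(delta, k - delta)  # wrap-around links allow either direction
    return hops


dests = [rotate_left(s, 4) for s in range(16)]
histogram = Counter(torus_hops(s, d) for s, d in enumerate(dests))

print(dests)
print(sorted(histogram.items()))  # (hop count, number of routes)
```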

23. Packaging costs
During the construction of a network, the nodes of a topology are mapped to packaging modules (chips, boards, chassis).
Topology and packaging impose constraints on the channel bandwidth, which can be used to compare different topologies.
We consider as an example a two-level packaging hierarchy. The channel width is denoted $w$. Two constraints are fixed: the number of pins per node $W_n$ and the amount of global wiring $W_s$.
We will also discuss how channel frequency is affected by the topology and packaging choice.

24. Constraints on a two-level packaging hierarchy: channel width (1)
At the first level, individual routers are connected by local wiring, which is inexpensive and abundant (for an example, see Figure 3.7). With an efficient local arrangement of nodes, the constraint on channel width depends only on the available number of pins $W_n$. In particular, $w \le W_n / \delta$.
The second level connects blocks via global wiring (for an example, see Figure 3.8). The number of available global wires $W_s$ bounds the width of individual channels. It is a good idea to use the bisection to partition the nodes into local groups. Using the minimum bisection, the constraint is $w \le W_s / B_C$.

25. Constraints on a two-level packaging hierarchy: channel width (2)
Combining the two expressions gives equation 3.14, $w \le \min(W_n / \delta, W_s / B_C)$.
Networks with low degree are constrained by the first term. Generally they are node-pin limited.
Networks with high degree are constrained by the second term.
The constraint can also be expressed in terms of bandwidth, and equation 3.14 can be rewritten accordingly.

26. Constraints on a two-level packaging hierarchy: wire length
In addition to the width of the available wires, their length must be considered. Wires should be kept short because, beyond a critical length, the signaling frequency falls quadratically with wire length.
The critical length is related to the maximum frequency-dependent attenuation tolerated by the system. Table 3.2 shows the critical length of common types of wires at a 2 GHz rate.
The length can be increased by inserting repeaters, but a repeater costs about as much as a switch. It is therefore suggested to respect the critical length and to insert switches on the longest routes.
It is impractical to build electrical networks using topologies that require long channels. Optical signaling is more convenient in that case, but more expensive.

27. Example
Comparison between two six-node networks (Fig. 3.9):
A simple ring with degree 4 and $B_C = 4$.
A Cayley graph with degree 6 and $B_C = 10$.
The maximum pin number is 140 and the global wiring is 200 signals wide.
Applying the previous equation gives equation 3.15 for the first network and equation 3.16 for the second one. The results for a signal frequency of 1 GHz are in Table 3.3.
The Cayley graph has better throughput, but the ring has lower zero-load latency. The Cayley graph takes advantage of the full bisection width, but its higher degree limits the width of an individual channel, which increases the serialization latency. This is a counterintuitive result, given that the Cayley graph has a lower hop count.
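
The width limits for the two networks can be worked out with a short sketch. It assumes the combined constraint takes the form $w \le \min(W_n / \delta, W_s / B_C)$, rounded down to a whole number of signals; the function name is hypothetical, and the contents of Table 3.3 are not reproduced here.

```python
def max_channel_width(w_n, w_s, degree, b_c):
    """Largest integer channel width allowed by node pins and global wiring."""
    pin_limit = w_n // degree     # node-pin constraint, W_n / delta
    bisection_limit = w_s // b_c  # global wiring constraint, W_s / B_C
    return min(pin_limit, bisection_limit)


W_N, W_S = 140, 200  # pins per node and global wires, from the example
FREQ_GHZ = 1         # signal frequency, from the example

ring_w = max_channel_width(W_N, W_S, degree=4, b_c=4)
cayley_w = max_channel_width(W_N, W_S, degree=6, b_c=10)

# Channel bandwidth in Gbit/s is width times frequency in GHz.
print("ring:", ring_w, "signals,", ring_w * FREQ_GHZ, "Gbit/s")
print("Cayley graph:", cayley_w, "signals,", cayley_w * FREQ_GHZ, "Gbit/s")
```

Under these assumptions the ring is limited by its node pins and the Cayley graph by the global wiring, so the ring ends up with the wider channels and hence the shorter serialization latency.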

28. Case study: SGI Origin 2000 (1)
The machine supports up to 512 nodes with 2 MIPS R10000 processors each.
Its network is based on the SGI SPIDER routing chip, which provides 6 bidirectional network channels. Each channel is 20 bits wide and operates at 400 MHz. The channel bandwidth is 6.4 Gbits/s and the total node bandwidth is 38.4 Gbits/s.
The channels can be driven across a backplane, and three of them can also drive up to 5 meters of cable.
Figure 3.11 illustrates how the topology changes as the number of nodes increases. Processing nodes are connected to a router using 2 of its 6 channels, leaving four channels for the network. Systems with up to 16 routers are configured as binary n-cubes. If there are unused channels, they can be connected across the machine to reduce the network diameter.

29. Case study: SGI Origin 2000 (2)
Figure 3.12 shows the topology used when there are more than 16 routers. It is a hierarchical approach:
8-router local subnetworks are configured as binary 3-cube networks.
8 router-only global subnetworks are used to connect the local ones together.
A maximal 256-router configuration uses eight 32-node binary 5-cubes for the global interconnections.
The Origin 2000 is packaged in a hierarchy of boards, modules and racks, as shown in Figure 3.13:
Each node is packaged on a single board, and each router is packaged on a separate board.
4 node boards and 2 router boards are packaged in a chassis and connected by a midplane.
2 chassis are placed in each cabinet.
64 is the maximum number of cabinets for a system (256 routers).

30. Case study: SGI Origin 2000 (3)
Table 3.4 shows the performance of the Origin 2000 as a function of the number of nodes.
Zero-load latency grows with average hop count and distance, and also depends on the diameter and on the serialization latency. Serialization latency is fixed by the 20-bit channel width.
To keep latency low, the Origin 2000 has a topology in which diameter and hop count increase logarithmically with the machine size. The hierarchical topology preserves the logarithmic growth of diameter and hop count in configurations with more than 16 routers.
The Origin 2000 topology provides a flat bisection bandwidth per node.
For small machines, the bisection cuts $N$ channels, where $N = 2^n$ is the number of routers.
For large machines, each node has a channel to a global subnet, and each global subnet has a bisection bandwidth equal to its input bandwidth.


More information

Parallel Computer Architecture II

Parallel Computer Architecture II Parallel Computer Architecture II Stefan Lang Interdisciplinary Center for Scientific Computing (IWR) University of Heidelberg INF 368, Room 532 D-692 Heidelberg phone: 622/54-8264 email: Stefan.Lang@iwr.uni-heidelberg.de

More information

Routing Algorithm. How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus)

Routing Algorithm. How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus) Routing Algorithm How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus) Many routing algorithms exist 1) Arithmetic 2) Source-based 3) Table lookup

More information

Slim Fly: A Cost Effective Low-Diameter Network Topology

Slim Fly: A Cost Effective Low-Diameter Network Topology TORSTEN HOEFLER, MACIEJ BESTA Slim Fly: A Cost Effective Low-Diameter Network Topology Images belong to their creator! NETWORKS, LIMITS, AND DESIGN SPACE Networks cost 25-30% of a large supercomputer Hard

More information

Finding Worst-case Permutations for Oblivious Routing Algorithms

Finding Worst-case Permutations for Oblivious Routing Algorithms Stanford University Concurrent VLSI Architecture Memo 2 Stanford University Computer Systems Laboratory Finding Worst-case Permutations for Oblivious Routing Algorithms Brian Towles Abstract We present

More information

Deadlock and Livelock. Maurizio Palesi

Deadlock and Livelock. Maurizio Palesi Deadlock and Livelock 1 Deadlock (When?) Deadlock can occur in an interconnection network, when a group of packets cannot make progress, because they are waiting on each other to release resource (buffers,

More information

Local Area Network Overview

Local Area Network Overview Local Area Network Overview Chapter 15 CS420/520 Axel Krings Page 1 LAN Applications (1) Personal computer LANs Low cost Limited data rate Back end networks Interconnecting large systems (mainframes and

More information

Network Dilation: A Strategy for Building Families of Parallel Processing Architectures Behrooz Parhami

Network Dilation: A Strategy for Building Families of Parallel Processing Architectures Behrooz Parhami Network Dilation: A Strategy for Building Families of Parallel Processing Architectures Behrooz Parhami Dept. Electrical & Computer Eng. Univ. of California, Santa Barbara Parallel Computer Architecture

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #11 2/21/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Outline Midterm 1:

More information

Interconnection topologies (cont.) [ ] In meshes and hypercubes, the average distance increases with the dth root of N.

Interconnection topologies (cont.) [ ] In meshes and hypercubes, the average distance increases with the dth root of N. Interconnection topologies (cont.) [ 10.4.4] In meshes and hypercubes, the average distance increases with the dth root of N. In a tree, the average distance grows only logarithmically. A simple tree structure,

More information

Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks

Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks X. Yuan, R. Melhem and R. Gupta Department of Computer Science University of Pittsburgh Pittsburgh, PA 156 fxyuan,

More information

Multiconfiguration Multihop Protocols: A New Class of Protocols for Packet-Switched WDM Optical Networks

Multiconfiguration Multihop Protocols: A New Class of Protocols for Packet-Switched WDM Optical Networks Multiconfiguration Multihop Protocols: A New Class of Protocols for Packet-Switched WDM Optical Networks Jason P. Jue, Member, IEEE, and Biswanath Mukherjee, Member, IEEE Abstract Wavelength-division multiplexing

More information

Place and Route for FPGAs

Place and Route for FPGAs Place and Route for FPGAs 1 FPGA CAD Flow Circuit description (VHDL, schematic,...) Synthesize to logic blocks Place logic blocks in FPGA Physical design Route connections between logic blocks FPGA programming

More information

EE382 Processor Design. Illinois

EE382 Processor Design. Illinois EE382 Processor Design Winter 1998 Chapter 8 Lectures Multiprocessors Part II EE 382 Processor Design Winter 98/99 Michael Flynn 1 Illinois EE 382 Processor Design Winter 98/99 Michael Flynn 2 1 Write-invalidate

More information

Lecture 7: Flow Control - I

Lecture 7: Flow Control - I ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 7: Flow Control - I Tushar Krishna Assistant Professor School of Electrical

More information

Physical Organization of Parallel Platforms. Alexandre David

Physical Organization of Parallel Platforms. Alexandre David Physical Organization of Parallel Platforms Alexandre David 1.2.05 1 Static vs. Dynamic Networks 13-02-2008 Alexandre David, MVP'08 2 Interconnection networks built using links and switches. How to connect:

More information

Data Communication and Parallel Computing on Twisted Hypercubes

Data Communication and Parallel Computing on Twisted Hypercubes Data Communication and Parallel Computing on Twisted Hypercubes E. Abuelrub, Department of Computer Science, Zarqa Private University, Jordan Abstract- Massively parallel distributed-memory architectures

More information

CS575 Parallel Processing

CS575 Parallel Processing CS575 Parallel Processing Lecture three: Interconnection Networks Wim Bohm, CSU Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 license.

More information

Chapter 4 NETWORK HARDWARE

Chapter 4 NETWORK HARDWARE Chapter 4 NETWORK HARDWARE 1 Network Devices As Organizations grow, so do their networks Growth in number of users Geographical Growth Network Devices : Are products used to expand or connect networks.

More information

Multiprocessor Interconnection Networks

Multiprocessor Interconnection Networks Multiprocessor Interconnection Networks Todd C. Mowry CS 740 November 19, 1998 Topics Network design space Contention Active messages Networks Design Options: Topology Routing Direct vs. Indirect Physical

More information

Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996

Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996 Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996 RHK.S96 1 Review: ABCs of Networks Starting Point: Send bits between 2 computers Queue

More information

Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies. Admin

Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies. Admin Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies Alvin R. Lebeck CPS 220 Admin Homework #5 Due Dec 3 Projects Final (yes it will be cumulative) CPS 220 2 1 Review: Terms Network characterized

More information

Multicomputer distributed system LECTURE 8

Multicomputer distributed system LECTURE 8 Multicomputer distributed system LECTURE 8 DR. SAMMAN H. AMEEN 1 Wide area network (WAN); A WAN connects a large number of computers that are spread over large geographic distances. It can span sites in

More information

Worst-case Ethernet Network Latency for Shaped Sources

Worst-case Ethernet Network Latency for Shaped Sources Worst-case Ethernet Network Latency for Shaped Sources Max Azarov, SMSC 7th October 2005 Contents For 802.3 ResE study group 1 Worst-case latency theorem 1 1.1 Assumptions.............................

More information

Performance Analysis of Storage-Based Routing for Circuit-Switched Networks [1]

Performance Analysis of Storage-Based Routing for Circuit-Switched Networks [1] Performance Analysis of Storage-Based Routing for Circuit-Switched Networks [1] Presenter: Yongcheng (Jeremy) Li PhD student, School of Electronic and Information Engineering, Soochow University, China

More information

CH : 15 LOCAL AREA NETWORK OVERVIEW

CH : 15 LOCAL AREA NETWORK OVERVIEW CH : 15 LOCAL AREA NETWORK OVERVIEW P. 447 LAN (Local Area Network) A LAN consists of a shared transmission medium and a set of hardware and software for interfacing devices to the medium and regulating

More information

More on LANS. LAN Wiring, Interface

More on LANS. LAN Wiring, Interface More on LANS Chapters 10-11 LAN Wiring, Interface Mostly covered this material already NIC = Network Interface Card Separate processor, buffers incoming/outgoing data CPU might not be able to keep up network

More information

Lecture 24: Interconnection Networks. Topics: topologies, routing, deadlocks, flow control

Lecture 24: Interconnection Networks. Topics: topologies, routing, deadlocks, flow control Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control 1 Topology Examples Grid Torus Hypercube Criteria Bus Ring 2Dtorus 6-cube Fully connected Performance Bisection

More information

3. Evaluation of Selected Tree and Mesh based Routing Protocols

3. Evaluation of Selected Tree and Mesh based Routing Protocols 33 3. Evaluation of Selected Tree and Mesh based Routing Protocols 3.1 Introduction Construction of best possible multicast trees and maintaining the group connections in sequence is challenging even in

More information

CSC630/CSC730: Parallel Computing

CSC630/CSC730: Parallel Computing CSC630/CSC730: Parallel Computing Parallel Computing Platforms Chapter 2 (2.4.1 2.4.4) Dr. Joe Zhang PDC-4: Topology 1 Content Parallel computing platforms Logical organization (a programmer s view) Control

More information

From Routing to Traffic Engineering

From Routing to Traffic Engineering 1 From Routing to Traffic Engineering Robert Soulé Advanced Networking Fall 2016 2 In the beginning B Goal: pair-wise connectivity (get packets from A to B) Approach: configure static rules in routers

More information

VIII. Communication costs, routing mechanism, mapping techniques, cost-performance tradeoffs. April 6 th, 2009

VIII. Communication costs, routing mechanism, mapping techniques, cost-performance tradeoffs. April 6 th, 2009 VIII. Communication costs, routing mechanism, mapping techniques, cost-performance tradeoffs April 6 th, 2009 Message Passing Costs Major overheads in the execution of parallel programs: from communication

More information

Chapter 06 IP Address

Chapter 06 IP Address Chapter 06 IP Address IP Address Internet address Identifier used at IP layer 32 bit binary address The address space of IPv4 is 2 32 or 4,294,967,296 Consists of netid and hosted IP Address Structure

More information

Linux System Administration

Linux System Administration IP Addressing Subnetting Objective At the conclusion of this module, the student will be able to: Describe how packets are routed from one network to another Describe the parts and classes of IPv4 address

More information

Optical Loss Budgets

Optical Loss Budgets CHAPTER 4 The optical loss budget is an important aspect in designing networks with the Cisco ONS 15540. The optical loss budget is the ultimate limiting factor in distances between nodes in a topology.

More information

Lecture 3: Sorting 1

Lecture 3: Sorting 1 Lecture 3: Sorting 1 Sorting Arranging an unordered collection of elements into monotonically increasing (or decreasing) order. S = a sequence of n elements in arbitrary order After sorting:

More information

Performance Evaluation of Probe-Send Fault-tolerant Network-on-chip Router

Performance Evaluation of Probe-Send Fault-tolerant Network-on-chip Router erformance Evaluation of robe-send Fault-tolerant Network-on-chip Router Sumit Dharampal Mediratta 1, Jeffrey Draper 2 1 NVIDIA Graphics vt Ltd, 2 SC Information Sciences Institute 1 Bangalore, India-560001,

More information

Lecture 15: PCM, Networks. Today: PCM wrap-up, projects discussion, on-chip networks background

Lecture 15: PCM, Networks. Today: PCM wrap-up, projects discussion, on-chip networks background Lecture 15: PCM, Networks Today: PCM wrap-up, projects discussion, on-chip networks background 1 Hard Error Tolerance in PCM PCM cells will eventually fail; important to cause gradual capacity degradation

More information

Estimation of Wirelength

Estimation of Wirelength Placement The process of arranging the circuit components on a layout surface. Inputs: A set of fixed modules, a netlist. Goal: Find the best position for each module on the chip according to appropriate

More information

Sorting is ordering a list of objects. Here are some sorting algorithms

Sorting is ordering a list of objects. Here are some sorting algorithms Sorting Sorting is ordering a list of objects. Here are some sorting algorithms Bubble sort Insertion sort Selection sort Mergesort Question: What is the lower bound for all sorting algorithms? Algorithms

More information

Dynamic Stress Wormhole Routing for Spidergon NoC with effective fault tolerance and load distribution

Dynamic Stress Wormhole Routing for Spidergon NoC with effective fault tolerance and load distribution Dynamic Stress Wormhole Routing for Spidergon NoC with effective fault tolerance and load distribution Nishant Satya Lakshmikanth sailtosatya@gmail.com Krishna Kumaar N.I. nikrishnaa@gmail.com Sudha S

More information

A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup

A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup Yan Sun and Min Sik Kim School of Electrical Engineering and Computer Science Washington State University Pullman, Washington

More information

Hardware Evolution in Data Centers

Hardware Evolution in Data Centers Hardware Evolution in Data Centers 2004 2008 2011 2000 2013 2014 Trend towards customization Increase work done per dollar (CapEx + OpEx) Paolo Costa Rethinking the Network Stack for Rack-scale Computers

More information

ET4254 Communications and Networking 1

ET4254 Communications and Networking 1 Topic 10:- Local Area Network Overview Aims:- LAN topologies and media LAN protocol architecture bridges, hubs, layer 2 & 3 switches 1 LAN Applications (1) personal computer LANs low cost limited data

More information

Design of Parallel Algorithms. The Architecture of a Parallel Computer

Design of Parallel Algorithms. The Architecture of a Parallel Computer + Design of Parallel Algorithms The Architecture of a Parallel Computer + Trends in Microprocessor Architectures n Microprocessor clock speeds are no longer increasing and have reached a limit of 3-4 Ghz

More information

Interconnection Networks

Interconnection Networks Lecture 17: Interconnection Networks Parallel Computer Architecture and Programming A comment on web site comments It is okay to make a comment on a slide/topic that has already been commented on. In fact

More information

Interconnection Networks: Flow Control. Prof. Natalie Enright Jerger

Interconnection Networks: Flow Control. Prof. Natalie Enright Jerger Interconnection Networks: Flow Control Prof. Natalie Enright Jerger Switching/Flow Control Overview Topology: determines connectivity of network Routing: determines paths through network Flow Control:

More information

Randomized Partially-Minimal Routing: Near-Optimal Oblivious Routing for 3-D Mesh Networks

Randomized Partially-Minimal Routing: Near-Optimal Oblivious Routing for 3-D Mesh Networks 2080 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 11, NOVEMBER 2012 Randomized Partially-Minimal Routing: Near-Optimal Oblivious Routing for 3-D Mesh Networks Rohit Sunkam

More information

CSE Introduction to Parallel Processing. Chapter 4. Models of Parallel Processing

CSE Introduction to Parallel Processing. Chapter 4. Models of Parallel Processing Dr Izadi CSE-4533 Introduction to Parallel Processing Chapter 4 Models of Parallel Processing Elaborate on the taxonomy of parallel processing from chapter Introduce abstract models of shared and distributed

More information

GIAN Course on Distributed Network Algorithms. Network Topologies and Local Routing

GIAN Course on Distributed Network Algorithms. Network Topologies and Local Routing GIAN Course on Distributed Network Algorithms Network Topologies and Local Routing Stefan Schmid @ T-Labs, 2011 GIAN Course on Distributed Network Algorithms Network Topologies and Local Routing If you

More information

Lecture 9: Group Communication Operations. Shantanu Dutt ECE Dept. UIC

Lecture 9: Group Communication Operations. Shantanu Dutt ECE Dept. UIC Lecture 9: Group Communication Operations Shantanu Dutt ECE Dept. UIC Acknowledgement Adapted from Chapter 4 slides of the text, by A. Grama w/ a few changes, augmentations and corrections Topic Overview

More information

Hyper-Butterfly Network: A Scalable Optimally Fault Tolerant Architecture

Hyper-Butterfly Network: A Scalable Optimally Fault Tolerant Architecture Hyper-Butterfly Network: A Scalable Optimally Fault Tolerant Architecture Wei Shi and Pradip K Srimani Department of Computer Science Colorado State University Ft. Collins, CO 80523 Abstract Bounded degree

More information

Parallel Architecture. Sathish Vadhiyar

Parallel Architecture. Sathish Vadhiyar Parallel Architecture Sathish Vadhiyar Motivations of Parallel Computing Faster execution times From days or months to hours or seconds E.g., climate modelling, bioinformatics Large amount of data dictate

More information

Introduction to Multiprocessors (Part I) Prof. Cristina Silvano Politecnico di Milano

Introduction to Multiprocessors (Part I) Prof. Cristina Silvano Politecnico di Milano Introduction to Multiprocessors (Part I) Prof. Cristina Silvano Politecnico di Milano Outline Key issues to design multiprocessors Interconnection network Centralized shared-memory architectures Distributed

More information