Chapter 3 : Topology basics
- Clara Bruce
- 6 years ago
- What is the network topology
- Nomenclature
- Traffic patterns
- Performance
- Packaging cost
- Case study: the SGI Origin 2000

Network topology (1)
- The topology is the static arrangement of channels and nodes in an interconnection network.
- Topology selection is the first step in the design of a network; it specifies both the type of network and the associated details.
- Selecting a good topology means fitting the requirements to the available packaging technology.
  - The design depends on the number of ports and the duty factor of the ports.
  - It also depends on the pins available per chip and board, wire density, signaling rate and cable length.
- The choice is based on cost and performance.
  - Performance is evaluated in terms of throughput and latency.
  - Cost depends on the number and complexity of the chips, as well as on the density and length of the interconnections used.

Network topology (2)
- The choice cannot be based only on the data communication model of the problem.
- Matching the network to the problem seems a good choice, but a special-purpose network is generally a bad idea:
  - The load is poorly balanced, because of dynamic load imbalance and/or a mismatch between problem size and machine size. If data and threads are relocated to balance the load, the initial match is lost.
  - The available packaging does not allow such networks to be implemented.
  - The network is inflexible: if the algorithm changes, the network cannot be modified to follow it.
- Some examples are shown in Fig. 3.1.

Nomenclature (1)
Nodes and channels
- $N^*$ is the set of nodes, $N \subseteq N^*$ the set of terminal nodes, and $C \subseteq N^* \times N^*$ the set of channels.
- A channel $c = (x, y) \in C$, where $x, y \in N^*$, has:
  - source node $s_c$ and destination node $d_c$
  - width $w_c$
  - frequency $f_c$
  - latency $t_c$
  - physical length $l_c$; in general $l_c = v t_c$, where $v$ is the propagation velocity
  - bandwidth $b_c = w_c f_c$
- A switch node $x$ has:
  - channel set $C_x = C_{Ix} \cup C_{Ox}$
  - degree $\delta_x = |C_x| = \delta_{Ix} + \delta_{Ox}$, the sum of the in degree and the out degree
- If the degree is the same for every node, it is written $\delta$.

Nomenclature (2)
Direct and indirect networks
- In a direct network, every node is both a terminal and a switch (Fig. 3.1a).
  - Packets are forwarded directly between terminal nodes.
  - The resources of a terminal are available to each switch.
- In an indirect network, a node is either a terminal or a switch (Fig. 3.1b).
  - Packets are forwarded indirectly through dedicated switch nodes.
- Every direct network can be redrawn as an indirect network by splitting each node into a terminal and a switch (Fig. 3.2).

Nomenclature (3)
Cuts
- A cut is a set of channels that partitions the set of all nodes into two disjoint sets $N_1$ and $N_2$.
- Each channel of the cut $C(N_1, N_2)$ connects a node in $N_1$ to a node in $N_2$.
- The total bandwidth of the cut is $B(N_1, N_2) = \sum_{c \in C(N_1, N_2)} b_c$.
Bisections
- A bisection is a cut that divides the entire network nearly in half.
- The channel bisection $B_C$ is the minimum channel count over all bisections (Sec. 3.1.3).
- The bisection bandwidth $B_B$ is the minimum bandwidth $B(N_1, N_2)$ over all bisections (Sec. 3.1.3).
- If the network has a uniform channel bandwidth $b$, then $B_B = b B_C$.

Nomenclature (4)
Paths
- A path (or route) is an ordered set of channels $P$ in which the destination node of each channel is the source node of the next one.
- A network is connected if at least one path exists between every source-destination pair.
- A minimal path from $x$ to $y$ is a path with the smallest hop count connecting the two nodes.
  - The set of all minimal paths from $x$ to $y$ is denoted $R_{xy}$.
  - The hop count of a minimal path is $H(x, y)$.
- The diameter $H_{max}$ is the largest minimal hop count over all pairs of nodes; a bound for a fully connected network is given in (eq. 3.1).
- The average minimum hop count $H_{min}$ is the average of $H(x, y)$ over all sources and destinations (Sec. 3.1.4).
- The physical distance of a path is $D(P) = \sum_{c \in P} l_c$ and its delay is $t(P) = D(P) / v$.

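The path definitions can be checked on a small network. The following is a minimal Python sketch, not taken from the slides: the helper names `ring_channels` and `hop_counts` are hypothetical, and the network is assumed to be an 8-node ring with one channel in each direction between neighbors. It computes $H(x, y)$ by breadth-first search, then derives connectivity, the diameter and the average minimum hop count.

```python
from collections import deque


def ring_channels(k):
    """Unidirectional channels of a k-node ring with links in both directions."""
    clockwise = [(x, (x + 1) % k) for x in range(k)]
    counterclockwise = [(x, (x - 1) % k) for x in range(k)]
    return clockwise + counterclockwise


def hop_counts(nodes, channels):
    """Return {(x, y): H(x, y)} using a breadth-first search from every node."""
    outgoing = {x: [] for x in nodes}
    for src, dst in channels:
        outgoing[src].append(dst)
    hops = {}
    for start in nodes:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            x = queue.popleft()
            for y in outgoing[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        for end, h in dist.items():
            hops[(start, end)] = h
    return hops


nodes = list(range(8))
hops = hop_counts(nodes, ring_channels(8))

# Connected: every (source, destination) pair was reached by some path.
connected = len(hops) == len(nodes) ** 2
h_max = max(hops.values())                    # diameter
h_min = sum(hops.values()) / len(nodes) ** 2  # average over all pairs, self pairs included
print(connected, h_max, h_min)
```

For the 8-node ring this gives a diameter of 4 and an average minimum hop count of 2, counting the zero-hop pair of each node with itself in the average.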
Nomenclature (5)
Symmetry
- A network is vertex-symmetric if there exists an automorphism that maps any node $a$ into any other node $b$.
  - Basically, the topology looks the same from the point of view of every node.
  - This can simplify routing.
- A network is edge-symmetric if there exists an automorphism that maps any channel $a$ into any other channel $b$.
  - This improves load balance.

Traffic patterns (1)
- A traffic pattern is the spatial distribution of messages in an interconnection network.
- It is described by a traffic matrix $\Lambda$: each element $\lambda_{s,d}$ gives the fraction of traffic sent from node $s$ to node $d$.
- Common static traffic patterns are listed in Tab. 3.1.
- Random traffic
  - Each source $s$ is equally likely to send to each destination.
  - It balances load even for topologies and routing algorithms with very poor load balance.
- Permutation traffic
  - Each source $s$ sends all its traffic to a single destination.
  - Permutations stress the load balance of a topology and a routing algorithm.

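As an illustration of the traffic matrix, the sketch below builds $\Lambda$ for random (uniform) traffic and for a permutation. The function names and the 4-node shift permutation are hypothetical choices for this example. Rows are indexed by source and columns by destination.

```python
def uniform_traffic(n):
    """Each source sends an equal fraction 1/n of its traffic to every destination."""
    return [[1.0 / n] * n for _ in range(n)]


def permutation_traffic(perm):
    """Source s sends all of its traffic to the single destination perm[s]."""
    n = len(perm)
    return [[1.0 if d == perm[s] else 0.0 for d in range(n)] for s in range(n)]


uniform = uniform_traffic(4)
# Hypothetical example permutation: every node sends to its right-hand neighbor.
shifted = permutation_traffic([1, 2, 3, 0])

for row in shifted:
    print(row)
```

In both matrices every row sums to 1, because each element is a fraction of the traffic offered by one source.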
Traffic patterns (2)
- Bit permutations: the destination address is computed by permuting and selectively complementing the bits of the source address.
- Digit permutations: the digits of the destination address are computed from the digits of the source address. They apply only to networks whose terminal addresses can be expressed as n-digit numbers.

Performance and cost
- A topology is selected on the basis of performance and cost.
- Performance is evaluated in terms of:
  - throughput and maximum channel load
  - latency
  - path diversity
- The cost of a topology is based on the sum of all the constraints imposed by the packaging technology used.

Throughput
- Throughput is the data rate, in bits per second, that the network accepts per input port.
- It depends on routing and flow control as much as on the topology.
- The ideal throughput of a topology is evaluated assuming perfect flow control and balanced routing.
- The ideal throughput of a network on uniform traffic is often called its capacity.
- Maximum throughput occurs when some channel becomes saturated.
- Computing the throughput therefore requires the channel load.

Channel load
- The load $\gamma_c$ on channel $c$ is the ratio of the bandwidth demanded from channel $c$ to the bandwidth of the input ports.
- The maximum channel load $\gamma_{max}$ is the load of the channel that carries the largest fraction of the traffic for a specific traffic pattern.
- When the offered traffic reaches the throughput of the network, the bandwidth demanded from this channel equals the channel bandwidth; any additional traffic overloads the channel.
- The ideal throughput of a topology is expressed in (eq. 3.2).
- In general, maximum channel load and throughput are computed by solving a multicommodity flow problem.
- For uniform traffic, simple upper and lower bounds can be computed.

Throughput upper bound in a uniform traffic pattern
- The load on the bisection channels gives a lower bound on the maximum channel load, and therefore an upper bound on throughput.
- For uniform traffic, $N/2$ packets must cross the $B_C$ bisection channels.
- As a consequence, the load on each bisection channel is at least the value given in (eq. 3.3).
- This gives an upper bound on the throughput (eq. 3.4).
- For example, in a $k$-node ring $B_C = 4$, and the bound on the ideal throughput is $8b/k$.

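The equations cited on this slide are not reproduced in the transcription. The derivation below is a reconstruction from the surrounding text, not the original typesetting. It assumes that the ideal throughput equals the channel bandwidth divided by the maximum channel load, $\Theta_{ideal} = b / \gamma_{max}$ (the symbol $\Theta_{ideal}$ is introduced here for illustration), and that each of the $N$ terminals offers one unit of uniform traffic.

```latex
% Half of the N units of traffic cross the B_C bisection channels,
% so the most heavily loaded bisection channel carries at least:
\gamma_{max} \;\ge\; \frac{N/2}{B_C} \;=\; \frac{N}{2 B_C}

% Dividing the channel bandwidth b by this load bounds the throughput:
\Theta_{ideal} \;=\; \frac{b}{\gamma_{max}} \;\le\; \frac{2 b B_C}{N} \;=\; \frac{2 B_B}{N}

% k-node ring: N = k and B_C = 4, hence
\Theta_{ideal} \;\le\; \frac{2 b \cdot 4}{k} \;=\; \frac{8b}{k}
```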
Channel load bounds in a uniform traffic pattern
- A lower bound on the channel load can be computed as follows:
  - $H_{min} N$ gives the channel demand for a given traffic pattern.
  - Dividing this demand by the number of channels $C$ bounds the load (eq. 3.5).
- These lower bounds can be complemented with a simple upper bound obtained by considering a balanced routing function:
  - If there are $|R_{xy}|$ minimal paths from $x$ to $y$, a fraction $1/|R_{xy}|$ of the traffic from $x$ to $y$ is placed on each channel of each minimal path.
  - The resulting maximum load is defined in (eq. 3.6).
- For any topology, $\gamma_{max,LB} \le \gamma_{max} \le \gamma_{max,UB}$.
- For an edge-symmetric topology, both bounds coincide with the maximum channel load.

Example of ideal throughput estimation in an eight-node ring network
- The topology of the network considered is shown in Fig. 3.3.
- The upper bound approach is applied to channel (3,4). In Fig. 3.3:
  - Dotted lines represent paths that count as half.
  - There are 6 solid lines and 4 dotted lines.
  - The maximum channel load is therefore $(6 + 4 \cdot 0.5)/8 = 1$.
- The lower bound gives the same result: $H_{min} N / C = 2 \cdot 8 / 16 = 1$.
- In the general case, an optimal distribution that minimizes the channel load must be computed.
  - Computing the solution is beyond the scope of this book.
  - It is enough to describe the problem formulation.

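The ring example can be reproduced numerically. The following Python sketch rests on stated assumptions: each source offers one unit of traffic, split as 1/k to each of the k destinations, and traffic between two nodes is divided evenly over all minimal paths. The function name `ring_channel_loads` is hypothetical.

```python
from fractions import Fraction


def ring_channel_loads(k):
    """Load on every channel of a k-node bidirectional ring (k >= 3), uniform traffic.

    Assumes each source offers one unit of traffic, 1/k to each destination,
    and that traffic is split evenly over all minimal paths.
    """
    loads = {}
    for x in range(k):
        loads[(x, (x + 1) % k)] = Fraction(0)  # clockwise channel
        loads[(x, (x - 1) % k)] = Fraction(0)  # counterclockwise channel
    for s in range(k):
        for d in range(k):
            cw = (d - s) % k   # hops when travelling clockwise
            ccw = (s - d) % k  # hops when travelling counterclockwise
            if cw == 0:
                continue       # source equals destination: no channel is used
            shortest = min(cw, ccw)
            directions = [step for step, hops in ((1, cw), (-1, ccw)) if hops == shortest]
            share = Fraction(1, k) / len(directions)  # halves when both ways are minimal
            for step in directions:
                x = s
                for _ in range(shortest):
                    y = (x + step) % k
                    loads[(x, y)] += share
                    x = y
    return loads


loads = ring_channel_loads(8)
gamma_34 = loads[(3, 4)]
gamma_max = max(loads.values())

# Lower bound H_min * N / C with H_min = 2, N = 8 nodes and C = 16 channels.
lower_bound = Fraction(2 * 8, 16)
print(gamma_34, gamma_max, lower_bound)
```

All three values come out as 1, which agrees with the slide: the ring is edge-symmetric, so the lower bound, the load on channel (3,4) and the maximum channel load coincide.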
Formulation of the mathematical problem
- For each destination $d$, a vector $x_d$ defines the average distribution of packets over the channels.
- A valid distribution is obtained by adding flow balance equations at each node: the sum of the incoming distributions minus the sum of the outgoing ones must equal the average number of packets that the node sources (+) or sinks (-).
- For a distribution under uniform traffic, every terminal node sources $1/N$ units and the destination sinks 1 unit.
- This is represented by the balance vector $f_d$ (eq. 3.7).
- The topology is expressed by the matrix $A$ (eq. 3.8).
- The optimization problem is written in (eq. 3.9).
- By modifying (eq. 3.6) and (eq. 3.9), the problem can be generalized to an arbitrary traffic pattern.

Latency
- Latency is the time required for a packet to traverse the network.
- It can be divided into two components (eq. 3.10):
  - The head latency is the time required for the head of the message to traverse the network.
  - The serialization latency is the time required for the tail to catch up.
- Latency depends on the topology, the routing, the flow control and the design of the router.
- Here the focus is on the contribution of the topology.

Dependency of the latency on the topology choice
- When no contention occurs, head latency depends on two factors connected with the topology:
  - the router delay, which is the time spent in the routers
  - the flight delay, which is the time spent on the wires
- The average router delay is $H_{min} t_r$, while the average flight delay is $D_{min}/v$.
- The resulting expression for the zero-load latency is given in (eq. 3.11).
- In case of contention, an additional term $T_c$ must be added to account for the time spent waiting for resources.
- $H_{min}$, $D_{min}$ and $b$ in (eq. 3.11) depend mostly on the topology (but also on the packaging).

Examples
- A packet propagating on a two-hop route from node x to node z via node y (Fig. 3.4):
  - First row: each phit of the packet arriving at node x
  - Second row: leaving x (routing delay $t_r$)
  - Third row: arriving at y (link latency $t_{xy}$)
  - Fourth row: leaving y (second routing delay $t_r$)
  - Fifth row: arriving at z (link latency $t_{yz}$)
  - The serialization latency $L/b$ must be added to this head latency.
- A 64-node network with $H_{avg} = 4$ hops and 16-bit-wide channels:
  - $f_c = 1$ GHz, $t_c = 5$ ns and $t_r = 8$ ns
  - The total routing delay is 32 ns (8*4).
  - The total wire delay is 20 ns (5*4).
  - If $L = 64$ bytes and $b = 2$ Gbytes/s, the serialization delay is 32 ns.
  - The total latency is 84 ns.

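The arithmetic of the 64-node example can be written out as a short Python sketch. It assumes the zero-load latency is the sum of router delay, wire delay and serialization latency, as described in the latency slides; the function and parameter names are hypothetical.

```python
def zero_load_latency_ns(hops, router_delay_ns, channel_delay_ns,
                         packet_bits, width_bits, freq_ghz):
    """Return (routing, wire, serialization, total) delays in nanoseconds."""
    routing = hops * router_delay_ns   # time spent in the routers
    wire = hops * channel_delay_ns     # time spent on the wires
    # b = w * f; at 1 GHz there is one cycle per nanosecond, so the
    # bandwidth in bits per nanosecond is width_bits * freq_ghz.
    bandwidth_bits_per_ns = width_bits * freq_ghz
    serialization = packet_bits / bandwidth_bits_per_ns
    return routing, wire, serialization, routing + wire + serialization


routing, wire, serialization, total = zero_load_latency_ns(
    hops=4, router_delay_ns=8, channel_delay_ns=5,
    packet_bits=64 * 8, width_bits=16, freq_ghz=1)
print(routing, wire, serialization, total)
```

A 16-bit channel at 1 GHz carries 16 Gbits/s, which is the 2 Gbytes/s quoted on the slide, so the 512-bit packet takes 32 ns to serialize and the total is 84 ns.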
Path diversity
- A network with multiple routes between most pairs of nodes is more robust than a network with only a single route. This property is called path diversity.
- It improves both the balance of the channel load and the fault tolerance.
- Load balance can be illustrated by considering a network under arbitrary permutation traffic:
  - Arbitrary permutation traffic is more challenging than uniform traffic.
  - Without path diversity, traffic can be concentrated on a single bottleneck channel.
- Path diversity makes it possible to handle faults:
  - Large networks must be able to tolerate faulty nodes or links.
  - One measure of network fault tolerance is the number of edge-disjoint or node-disjoint paths between two nodes.
  - If a fault affects all the neighbors of a node, however, there is no solution: the network is no longer connected.

Example
- Bit permutation traffic: every node sends a packet to the destination whose address is a bit permutation of its own.
  - The destination sequence for sources 0 to 15 is {0,2,4,6,8,10,12,14,1,3,5,7,9,11,13,15}.
- Behavior of a 2-ary 4-fly butterfly (Fig. 3.5):
  - All the packets from nodes 0, 1, 8 and 9 traverse channel (10,20).
  - The same situation occurs for the other nodes.
  - The channel load is 4 and the throughput is 25% of capacity.
- Behavior of a 4-ary 2-cube network (Fig. 3.6):
  - 2 routes traverse no channel, 4 routes traverse one channel, 4 routes two channels, 4 routes three channels and 2 routes four channels.
  - The one-hop channel is the bottleneck.
  - For this network the throughput is 50% of capacity.
  - If the 4 one-hop routes also use non-minimal paths, the traffic is spread uniformly.
  - The resulting throughput can reach 89% of capacity.

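The destination sequence in this example is consistent with rotating each 4-bit source address left by one position. That reading is an assumption: the slide only says the address is bit permuted. A minimal Python sketch under that assumption, with a hypothetical helper `rotate_left`:

```python
def rotate_left(address, bits=4):
    """Rotate a `bits`-wide address left by one position."""
    mask = (1 << bits) - 1
    # Shift left, drop the bit that overflows, and bring it back in at bit 0.
    return ((address << 1) & mask) | (address >> (bits - 1))


destinations = [rotate_left(s) for s in range(16)]
print(destinations)
```

The list produced for sources 0 to 15 matches the sequence on the slide, and because every destination appears exactly once the pattern is a permutation.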
Packaging costs
- When a network is built, the nodes of the topology are mapped to packaging modules (chips, boards, chassis).
- Topology and packaging together impose constraints on channel bandwidth, which can be used to compare different topologies.
- As an example, consider a two-level packaging hierarchy:
  - The channel width is denoted $w$.
  - The constraints are the number of pins per node, $W_n$, and the amount of global wiring, $W_s$.
- Channel frequency is also affected by the choice of topology and packaging.

Constraints on a two-level packaging hierarchy: channel width (1)
- At the first level, individual routers are connected by local wiring.
  - Local wiring is inexpensive and abundant (for an example, see Figure 3.7).
  - With an efficient local arrangement of nodes, the constraint on channel width depends only on the number of available pins $W_n$; in particular, $w \le W_n / \delta$.
- The second level connects blocks via global wiring (for an example, see Figure 3.8).
  - The number of available global wires $W_s$ bounds the width of individual channels.
  - It is a good idea to use the bisection to partition the nodes into local groups.
  - Using the minimum bisection, the constraint is $w \le W_s / B_C$.

Constraints on a two-level packaging hierarchy: channel width (2)
- Combining the two expressions gives equation 3.14: $w \le \min(W_n / \delta, W_s / B_C)$.
- Networks with low degree are constrained by the first term; they are generally node-pin limited.
- Networks with high degree are constrained by the second term.
- The constraint can also be expressed in terms of bandwidth: since $b = w f$, equation 3.14 can be rewritten as $b \le f \min(W_n / \delta, W_s / B_C)$.

Constraints on a two-level packaging hierarchy: wire length
- In addition to the width of the available wires, their length must be considered.
- Wires should be kept short because the achievable frequency falls quadratically with length.
- The critical length is related to the maximum frequency-dependent attenuation tolerated by the system.
- Table 3.2 shows the critical length of common types of wire at a 2 GHz rate.
- The length can be increased by inserting repeaters, but a repeater costs about the same as a switch.
- It is therefore advisable to keep channels within the critical length and to insert switches on the longest routes.
- It is impractical to build electrical networks with topologies that require long channels.
- Optical signaling is more convenient for long channels, but more expensive.

Example
- Comparison between two six-node networks (Fig. 3.9):
  - a simple ring with degree 4 and $B_C = 4$
  - a Cayley graph with degree 6 and $B_C = 10$
- The maximum pin count per node is 140 and the global wiring is 200 signals wide.
- Applying the previous equation gives equation 3.15 for the first network and equation 3.16 for the second.
- The results for a signal frequency of 1 GHz are in Table 3.3.
- The Cayley graph has better throughput, but the ring has lower zero-load latency.
- The Cayley graph takes advantage of the full bisection width, but its higher degree limits the width of an individual channel, which increases the serialization latency.
- This is a counterintuitive result, given that the Cayley graph has a lower hop count.

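The channel widths behind this comparison can be computed directly. The sketch below is a hedged reconstruction that assumes the width constraint is the smaller of the node-pin limit $W_n / \delta$ and the bisection limit $W_s / B_C$, rounded down to whole signals. The function name is hypothetical, and the contents of Table 3.3 are not reproduced here.

```python
def channel_width_limit(pins_per_node, global_wires, degree, bisection_channels):
    """Largest whole channel width allowed by node pins and by global wiring."""
    node_pin_limit = pins_per_node // degree               # W_n / delta
    bisection_limit = global_wires // bisection_channels   # W_s / B_C
    return min(node_pin_limit, bisection_limit)


W_N, W_S = 140, 200  # pins per node and global wiring, as given in the example
FREQ_GHZ = 1         # signal frequency in GHz

ring_w = channel_width_limit(W_N, W_S, degree=4, bisection_channels=4)
cayley_w = channel_width_limit(W_N, W_S, degree=6, bisection_channels=10)

# Channel bandwidth b = w * f in Gbits/s, and global wires used across the bisection.
ring_b, cayley_b = ring_w * FREQ_GHZ, cayley_w * FREQ_GHZ
ring_wires, cayley_wires = 4 * ring_w, 10 * cayley_w
print(ring_w, cayley_w, ring_wires, cayley_wires)
```

Under these assumptions the ring gets 35-bit channels and is limited by node pins, while the Cayley graph gets 20-bit channels and uses all 200 global wires. The narrower Cayley channels are what increase its serialization latency.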
Case study: SGI Origin 2000 (1)
- The system supports up to 512 nodes, each with 2 MIPS R10000 processors.
- Its network is based on the SGI SPIDER routing chip:
  - It provides 6 bidirectional network channels.
  - Each channel is 20 bits wide and operates at 400 MHz.
  - Channel bandwidth is 6.4 Gbits/s and total node bandwidth is 38.4 Gbits/s.
  - The channels can be driven across a backplane, and three of them can drive up to 5 meters of cable.
- Figure 3.11 illustrates how the topology changes as the number of nodes increases.
  - Each router connects to processing nodes using 2 of its 6 channels, leaving four channels.
  - Systems with up to 16 routers are configured as binary n-cubes.
  - Unused channels can be connected across the machine to reduce the network diameter.

Case study: SGI Origin 2000 (2)
- Figure 3.12 shows the topology used when there are more than 16 routers.
- The approach is hierarchical:
  - Local subnetworks of 8 routers are configured as binary 3-cubes.
  - 8 router-only global subnetworks are used to connect the local subnetworks together.
  - A maximal 256-router configuration uses 8 32-node binary 5-cubes for global interconnection.
- The Origin 2000 is packaged in a hierarchy of boards, modules and racks, as shown in Figure 3.13:
  - Each node is packaged on a single board, and each router on a separate board.
  - 4 node boards and 2 router boards are packaged in a chassis and connected by a midplane.
  - 2 chassis are placed in each cabinet.
  - A system has at most 64 cabinets (256 routers).

Case study: SGI Origin 2000 (3)
- Table 3.4 shows the performance of the Origin 2000 as a function of the number of nodes.
- Zero-load latency grows with the average hop count and distance, which depend on the diameter; the serialization latency is fixed by the 20-bit channel width.
- To keep latency low, the Origin 2000 uses a topology in which diameter and hop count increase only logarithmically with machine size.
- The hierarchical topology preserves the logarithmic growth of diameter and hop count in configurations with more than 16 routers.
- The Origin 2000 topology provides a flat bisection bandwidth per node:
  - For small machines, configured as binary n-cubes with $N = 2^n$ routers, the bisection cuts $N$ channels.
  - For large machines, each node has a channel to a global subnetwork, and each global subnetwork has a bisection bandwidth equal to its input bandwidth.
Topologies. Maurizio Palesi. Maurizio Palesi 1
Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and
More informationTopologies. Maurizio Palesi. Maurizio Palesi 1
Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and
More informationNetwork-on-chip (NOC) Topologies
Network-on-chip (NOC) Topologies 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and performance
More informationTopology basics. Constraints and measures. Butterfly networks.
EE48: Advanced Computer Organization Lecture # Interconnection Networks Architecture and Design Stanford University Topology basics. Constraints and measures. Butterfly networks. Lecture #: Monday, 7 April
More informationInterconnection Networks: Topology. Prof. Natalie Enright Jerger
Interconnection Networks: Topology Prof. Natalie Enright Jerger Topology Overview Definition: determines arrangement of channels and nodes in network Analogous to road map Often first step in network design
More informationHomework Assignment #1: Topology Kelly Shaw
EE482 Advanced Computer Organization Spring 2001 Professor W. J. Dally Homework Assignment #1: Topology Kelly Shaw As we have not discussed routing or flow control yet, throughout this problem set assume
More informationChapter 4 : Butterfly Networks
1 Chapter 4 : Butterfly Networks Structure of a butterfly network Isomorphism Channel load and throughput Optimization Path diversity Case study: BBN network 2 Structure of a butterfly network A K-ary
More informationLecture 3: Topology - II
ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 3: Topology - II Tushar Krishna Assistant Professor School of Electrical and
More information4. Networks. in parallel computers. Advances in Computer Architecture
4. Networks in parallel computers Advances in Computer Architecture System architectures for parallel computers Control organization Single Instruction stream Multiple Data stream (SIMD) All processors
More informationLecture 2: Topology - I
ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 2: Topology - I Tushar Krishna Assistant Professor School of Electrical and
More informationECE 4750 Computer Architecture, Fall 2017 T06 Fundamental Network Concepts
ECE 4750 Computer Architecture, Fall 2017 T06 Fundamental Network Concepts School of Electrical and Computer Engineering Cornell University revision: 2017-10-17-12-26 1 Network/Roadway Analogy 3 1.1. Running
More informationInterconnection networks
Interconnection networks When more than one processor needs to access a memory structure, interconnection networks are needed to route data from processors to memories (concurrent access to a shared memory
More informationInterconnect Technology and Computational Speed
Interconnect Technology and Computational Speed From Chapter 1 of B. Wilkinson et al., PARAL- LEL PROGRAMMING. Techniques and Applications Using Networked Workstations and Parallel Computers, augmented
More informationRecall: The Routing problem: Local decisions. Recall: Multidimensional Meshes and Tori. Properties of Routing Algorithms
CS252 Graduate Computer Architecture Lecture 16 Multiprocessor Networks (con t) March 14 th, 212 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~kubitron/cs252
More informationLecture: Interconnection Networks
Lecture: Interconnection Networks Topics: Router microarchitecture, topologies Final exam next Tuesday: same rules as the first midterm 1 Packets/Flits A message is broken into multiple packets (each packet
More informationINTERCONNECTION networks are used in a variety of applications,
1 Randomized Throughput-Optimal Oblivious Routing for Torus Networs Rohit Sunam Ramanujam, Student Member, IEEE, and Bill Lin, Member, IEEE Abstract In this paper, we study the problem of optimal oblivious
More informationMultiprocessor Interconnection Networks- Part Three
Babylon University College of Information Technology Software Department Multiprocessor Interconnection Networks- Part Three By The k-ary n-cube Networks The k-ary n-cube network is a radix k cube with
More informationLecture 12: Interconnection Networks. Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E)
Lecture 12: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) 1 Topologies Internet topologies are not very regular they grew
More informationInterconnection Network
Interconnection Network Recap: Generic Parallel Architecture A generic modern multiprocessor Network Mem Communication assist (CA) $ P Node: processor(s), memory system, plus communication assist Network
More informationCS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 9. Routing and Flow Control
CS 498 Hot Topics in High Performance Computing Networks and Fault Tolerance 9. Routing and Flow Control Intro What did we learn in the last lecture Topology metrics Including minimum diameter of directed
More informationInfiniBand SDR, DDR, and QDR Technology Guide
White Paper InfiniBand SDR, DDR, and QDR Technology Guide The InfiniBand standard supports single, double, and quadruple data rate that enables an InfiniBand link to transmit more data. This paper discusses
More informationThe Impact of Optics on HPC System Interconnects
The Impact of Optics on HPC System Interconnects Mike Parker and Steve Scott Hot Interconnects 2009 Manhattan, NYC Will cost-effective optics fundamentally change the landscape of networking? Yes. Changes
More informationThe final publication is available at
Document downloaded from: http://hdl.handle.net/10251/82062 This paper must be cited as: Peñaranda Cebrián, R.; Gómez Requena, C.; Gómez Requena, ME.; López Rodríguez, PJ.; Duato Marín, JF. (2016). The
More informationOFAR-CM: Efficient Dragonfly Networks with Simple Congestion Management
Marina Garcia 22 August 2013 OFAR-CM: Efficient Dragonfly Networks with Simple Congestion Management M. Garcia, E. Vallejo, R. Beivide, M. Valero and G. Rodríguez Document number OFAR-CM: Efficient Dragonfly
More informationLecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University
18 447 Lecture 26: Interconnects James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L26 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today get an overview of parallel
More informationInterconnection Networks: Routing. Prof. Natalie Enright Jerger
Interconnection Networks: Routing Prof. Natalie Enright Jerger Routing Overview Discussion of topologies assumed ideal routing In practice Routing algorithms are not ideal Goal: distribute traffic evenly
More informationModule 17: "Interconnection Networks" Lecture 37: "Introduction to Routers" Interconnection Networks. Fundamentals. Latency and bandwidth
Interconnection Networks Fundamentals Latency and bandwidth Router architecture Coherence protocol and routing [From Chapter 10 of Culler, Singh, Gupta] file:///e /parallel_com_arch/lecture37/37_1.htm[6/13/2012
More informationSHARED MEMORY VS DISTRIBUTED MEMORY
OVERVIEW Important Processor Organizations 3 SHARED MEMORY VS DISTRIBUTED MEMORY Classical parallel algorithms were discussed using the shared memory paradigm. In shared memory parallel platform processors
More informationBasic Switch Organization
NOC Routing 1 Basic Switch Organization 2 Basic Switch Organization Link Controller Used for coordinating the flow of messages across the physical link of two adjacent switches 3 Basic Switch Organization
More informationLecture 2 Parallel Programming Platforms
Lecture 2 Parallel Programming Platforms Flynn s Taxonomy In 1966, Michael Flynn classified systems according to numbers of instruction streams and the number of data stream. Data stream Single Multiple
More informationParallel Computing Platforms
Parallel Computing Platforms Network Topologies John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 14 28 February 2017 Topics for Today Taxonomy Metrics
More informationInterconnection Network. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University
Interconnection Network Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Topics Taxonomy Metric Topologies Characteristics Cost Performance 2 Interconnection
More informationChapter 7 Slicing and Dicing
1/ 22 Chapter 7 Slicing and Dicing Lasse Harju Tampere University of Technology lasse.harju@tut.fi 2/ 22 Concentrators and Distributors Concentrators Used for combining traffic from several network nodes
More informationBlueGene/L. Computer Science, University of Warwick. Source: IBM
BlueGene/L Source: IBM 1 BlueGene/L networking BlueGene system employs various network types. Central is the torus interconnection network: 3D torus with wrap-around. Each node connects to six neighbours
More informationInterconnection Network Project EE482 Advanced Computer Organization May 28, 1999
Interconnection Network Project EE482 Advanced Computer Organization May 28, 1999 Group Members: Overview Tom Fountain (fountain@cs.stanford.edu) T.J. Giuli (giuli@cs.stanford.edu) Paul Lassa (lassa@relgyro.stanford.edu)
More informationCS 6143 COMPUTER ARCHITECTURE II SPRING 2014
CS 6143 COMPUTER ARCHITECTURE II SPRING 2014 DUE : April 9, 2014 HOMEWORK IV READ : - Related portions of Chapter 5 and Appendces F and I of the Hennessy book - Related portions of Chapter 1, 4 and 6 of
More informationA New Theory of Deadlock-Free Adaptive. Routing in Wormhole Networks. Jose Duato. Abstract
A New Theory of Deadlock-Free Adaptive Routing in Wormhole Networks Jose Duato Abstract Second generation multicomputers use wormhole routing, allowing a very low channel set-up time and drastically reducing
More informationOn Topology and Bisection Bandwidth of Hierarchical-ring Networks for Shared-memory Multiprocessors
On Topology and Bisection Bandwidth of Hierarchical-ring Networks for Shared-memory Multiprocessors Govindan Ravindran Newbridge Networks Corporation Kanata, ON K2K 2E6, Canada gravindr@newbridge.com Michael
More informationInterconnection Network
Interconnection Network Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE3054: Multicore Systems, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu) Topics
More informationCS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99 CS258 S99 2
Real Machines Interconnection Network Topology Design Trade-offs CS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99
More informationCS 204 Lecture Notes on Elementary Network Analysis
CS 204 Lecture Notes on Elementary Network Analysis Mart Molle Department of Computer Science and Engineering University of California, Riverside CA 92521 mart@cs.ucr.edu October 18, 2006 1 First-Order
More informationCS 614 COMPUTER ARCHITECTURE II FALL 2005
CS 614 COMPUTER ARCHITECTURE II FALL 2005 DUE : November 23, 2005 HOMEWORK IV READ : i) Related portions of Chapters : 3, 10, 15, 17 and 18 of the Sima book and ii) Chapter 8 of the Hennessy book. ASSIGNMENT:
More informationEE382C Lecture 1. Bill Dally 3/29/11. EE 382C - S11 - Lecture 1 1
EE382C Lecture 1 Bill Dally 3/29/11 EE 382C - S11 - Lecture 1 1 Logistics Handouts Course policy sheet Course schedule Assignments Homework Research Paper Project Midterm EE 382C - S11 - Lecture 1 2 What
More informationEN2910A: Advanced Computer Architecture Topic 06: Supercomputers & Data Centers Prof. Sherief Reda School of Engineering Brown University
EN2910A: Advanced Computer Architecture Topic 06: Supercomputers & Data Centers Prof. Sherief Reda School of Engineering Brown University Material from: The Datacenter as a Computer: An Introduction to
More informationCOMPARISON OF OCTAGON-CELL NETWORK WITH OTHER INTERCONNECTED NETWORK TOPOLOGIES AND ITS APPLICATIONS
International Journal of Computer Engineering and Applications, Volume VII, Issue II, Part II, COMPARISON OF OCTAGON-CELL NETWORK WITH OTHER INTERCONNECTED NETWORK TOPOLOGIES AND ITS APPLICATIONS Sanjukta
More informationCommunication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems.
Cluster Networks Introduction Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems. As usual, the driver is performance
More informationLecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance
Lecture 13: Interconnection Networks Topics: lots of background, recent innovations for power and performance 1 Interconnection Networks Recall: fully connected network, arrays/rings, meshes/tori, trees,
More informationParallel Computer Architecture II
Parallel Computer Architecture II Stefan Lang Interdisciplinary Center for Scientific Computing (IWR) University of Heidelberg INF 368, Room 532 D-692 Heidelberg phone: 622/54-8264 email: Stefan.Lang@iwr.uni-heidelberg.de
More informationRouting Algorithm. How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus)
Routing Algorithm How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus) Many routing algorithms exist 1) Arithmetic 2) Source-based 3) Table lookup
More informationSlim Fly: A Cost Effective Low-Diameter Network Topology
TORSTEN HOEFLER, MACIEJ BESTA Slim Fly: A Cost Effective Low-Diameter Network Topology Images belong to their creator! NETWORKS, LIMITS, AND DESIGN SPACE Networks cost 25-30% of a large supercomputer Hard
More informationFinding Worst-case Permutations for Oblivious Routing Algorithms
Stanford University Concurrent VLSI Architecture Memo 2 Stanford University Computer Systems Laboratory Finding Worst-case Permutations for Oblivious Routing Algorithms Brian Towles Abstract We present
Deadlock and Livelock. Maurizio Palesi
Deadlock and Livelock 1 Deadlock (When?) Deadlock can occur in an interconnection network, when a group of packets cannot make progress, because they are waiting on each other to release resource (buffers,
Local Area Network Overview Chapter 15 CS420/520 Axel Krings Page 1 LAN Applications (1) Personal computer LANs Low cost Limited data rate Back end networks Interconnecting large systems (mainframes and
Network Dilation: A Strategy for Building Families of Parallel Processing Architectures Behrooz Parhami Dept. Electrical & Computer Eng. Univ. of California, Santa Barbara Parallel Computer Architecture
EE/CSCI 451: Parallel and Distributed Computation Lecture #11 2/21/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Outline Midterm 1:
Interconnection topologies (cont.) [ 10.4.4] In meshes and hypercubes, the average distance increases with the dth root of N. In a tree, the average distance grows only logarithmically. A simple tree structure,
Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks X. Yuan, R. Melhem and R. Gupta Department of Computer Science University of Pittsburgh Pittsburgh, PA 156 fxyuan,
Multiconfiguration Multihop Protocols: A New Class of Protocols for Packet-Switched WDM Optical Networks Jason P. Jue, Member, IEEE, and Biswanath Mukherjee, Member, IEEE Abstract Wavelength-division multiplexing
Place and Route for FPGAs 1 FPGA CAD Flow Circuit description (VHDL, schematic,...) Synthesize to logic blocks Place logic blocks in FPGA Physical design Route connections between logic blocks FPGA programming
EE382 Processor Design Winter 1998 Chapter 8 Lectures Multiprocessors Part II EE 382 Processor Design Winter 98/99 Michael Flynn 1 Illinois EE 382 Processor Design Winter 98/99 Michael Flynn 2 1 Write-invalidate
ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 7: Flow Control - I Tushar Krishna Assistant Professor School of Electrical
Physical Organization of Parallel Platforms Alexandre David 1.2.05 1 Static vs. Dynamic Networks 13-02-2008 Alexandre David, MVP'08 2 Interconnection networks built using links and switches. How to connect:
Data Communication and Parallel Computing on Twisted Hypercubes E. Abuelrub, Department of Computer Science, Zarqa Private University, Jordan Abstract- Massively parallel distributed-memory architectures
CS575 Parallel Processing Lecture three: Interconnection Networks Wim Bohm, CSU Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 license.
Chapter 4 NETWORK HARDWARE 1 Network Devices As Organizations grow, so do their networks Growth in number of users Geographical Growth Network Devices : Are products used to expand or connect networks.
Multiprocessor Interconnection Networks Todd C. Mowry CS 740 November 19, 1998 Topics Network design space Contention Active messages Networks Design Options: Topology Routing Direct vs. Indirect Physical
Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996 RHK.S96 1 Review: ABCs of Networks Starting Point: Send bits between 2 computers Queue
Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies Alvin R. Lebeck CPS 220 Admin Homework #5 Due Dec 3 Projects Final (yes it will be cumulative) CPS 220 2 1 Review: Terms Network characterized
Multicomputer distributed system LECTURE 8 DR. SAMMAN H. AMEEN 1 Wide area network (WAN); A WAN connects a large number of computers that are spread over large geographic distances. It can span sites in
Worst-case Ethernet Network Latency for Shaped Sources Max Azarov, SMSC 7th October 2005 Contents For 802.3 ResE study group 1 Worst-case latency theorem 1 1.1 Assumptions.............................
Performance Analysis of Storage-Based Routing for Circuit-Switched Networks [1] Presenter: Yongcheng (Jeremy) Li PhD student, School of Electronic and Information Engineering, Soochow University, China
CH : 15 LOCAL AREA NETWORK OVERVIEW P. 447 LAN (Local Area Network) A LAN consists of a shared transmission medium and a set of hardware and software for interfacing devices to the medium and regulating
More on LANS Chapters 10-11 LAN Wiring, Interface Mostly covered this material already NIC = Network Interface Card Separate processor, buffers incoming/outgoing data CPU might not be able to keep up network
Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control 1 Topology Examples Grid Torus Hypercube Criteria Bus Ring 2Dtorus 6-cube Fully connected Performance Bisection
33 3. Evaluation of Selected Tree and Mesh based Routing Protocols 3.1 Introduction Construction of best possible multicast trees and maintaining the group connections in sequence is challenging even in
CSC630/CSC730: Parallel Computing Parallel Computing Platforms Chapter 2 (2.4.1 2.4.4) Dr. Joe Zhang PDC-4: Topology 1 Content Parallel computing platforms Logical organization (a programmer s view) Control
1 From Routing to Traffic Engineering Robert Soulé Advanced Networking Fall 2016 2 In the beginning B Goal: pair-wise connectivity (get packets from A to B) Approach: configure static rules in routers
VIII. Communication costs, routing mechanism, mapping techniques, cost-performance tradeoffs April 6 th, 2009 Message Passing Costs Major overheads in the execution of parallel programs: from communication
Chapter 06 IP Address IP Address Internet address Identifier used at IP layer 32 bit binary address The address space of IPv4 is 2^32 or 4,294,967,296 Consists of netid and hostid IP Address Structure
Linux System Administration
IP Addressing Subnetting Objective At the conclusion of this module, the student will be able to: Describe how packets are routed from one network to another Describe the parts and classes of IPv4 address
Optical Loss Budgets
CHAPTER 4 The optical loss budget is an important aspect in designing networks with the Cisco ONS 15540. The optical loss budget is the ultimate limiting factor in distances between nodes in a topology.
Lecture 3: Sorting 1 Sorting Arranging an unordered collection of elements into monotonically increasing (or decreasing) order. S = a sequence of n elements in arbitrary order After sorting:
Performance Evaluation of Probe-Send Fault-tolerant Network-on-chip Router Sumit Dharampal Mediratta 1, Jeffrey Draper 2 1 NVIDIA Graphics Pvt Ltd, 2 USC Information Sciences Institute 1 Bangalore, India-560001,
Lecture 15: PCM, Networks Today: PCM wrap-up, projects discussion, on-chip networks background 1 Hard Error Tolerance in PCM PCM cells will eventually fail; important to cause gradual capacity degradation
Estimation of Wirelength
Placement The process of arranging the circuit components on a layout surface. Inputs: A set of fixed modules, a netlist. Goal: Find the best position for each module on the chip according to appropriate
Sorting Sorting is ordering a list of objects. Here are some sorting algorithms Bubble sort Insertion sort Selection sort Mergesort Question: What is the lower bound for all sorting algorithms? Algorithms
Dynamic Stress Wormhole Routing for Spidergon NoC with effective fault tolerance and load distribution Nishant Satya Lakshmikanth sailtosatya@gmail.com Krishna Kumaar N.I. nikrishnaa@gmail.com Sudha S
A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup Yan Sun and Min Sik Kim School of Electrical Engineering and Computer Science Washington State University Pullman, Washington
Hardware Evolution in Data Centers 2004 2008 2011 2000 2013 2014 Trend towards customization Increase work done per dollar (CapEx + OpEx) Paolo Costa Rethinking the Network Stack for Rack-scale Computers
ET4254 Communications and Networking 1
Topic 10:- Local Area Network Overview Aims:- LAN topologies and media LAN protocol architecture bridges, hubs, layer 2 & 3 switches 1 LAN Applications (1) personal computer LANs low cost limited data
Design of Parallel Algorithms The Architecture of a Parallel Computer Trends in Microprocessor Architectures Microprocessor clock speeds are no longer increasing and have reached a limit of 3-4 GHz
Lecture 17: Interconnection Networks Parallel Computer Architecture and Programming A comment on web site comments It is okay to make a comment on a slide/topic that has already been commented on. In fact
Interconnection Networks: Flow Control Prof. Natalie Enright Jerger Switching/Flow Control Overview Topology: determines connectivity of network Routing: determines paths through network Flow Control:
2080 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 11, NOVEMBER 2012 Randomized Partially-Minimal Routing: Near-Optimal Oblivious Routing for 3-D Mesh Networks Rohit Sunkam
Dr Izadi CSE-4533 Introduction to Parallel Processing Chapter 4 Models of Parallel Processing Elaborate on the taxonomy of parallel processing from chapter Introduce abstract models of shared and distributed
GIAN Course on Distributed Network Algorithms Network Topologies and Local Routing Stefan Schmid @ T-Labs, 2011 GIAN Course on Distributed Network Algorithms Network Topologies and Local Routing If you
Lecture 9: Group Communication Operations Shantanu Dutt ECE Dept. UIC Acknowledgement Adapted from Chapter 4 slides of the text, by A. Grama w/ a few changes, augmentations and corrections Topic Overview
Hyper-Butterfly Network: A Scalable Optimally Fault Tolerant Architecture Wei Shi and Pradip K Srimani Department of Computer Science Colorado State University Ft. Collins, CO 80523 Abstract Bounded degree
Parallel Architecture Sathish Vadhiyar Motivations of Parallel Computing Faster execution times From days or months to hours or seconds E.g., climate modelling, bioinformatics Large amount of data dictate
Introduction to Multiprocessors (Part I) Prof. Cristina Silvano Politecnico di Milano Outline Key issues to design multiprocessors Interconnection network Centralized shared-memory architectures Distributed