Interconnection Networks

Size: px
Start display at page:

Download "Interconnection Networks"

Transcription

1 C/ECE 757: Advanced Computer Architecture II (arallel Computer Architecture) Interconnection Network Design (Chapter 10) Copyright 2001 ark D. Hill University of Wisconsin-adison lides are derived from work by arita Adve (Illinois), Babak Falsafi (CU), Alvy Lebeck (Duke), teve Reinhardt (ichigan), and J.. ingh (rinceton). Thanks! Interconnection Networks Goal: Communication between computers Warning: Terminology-rich environment Focus on Networks for arallel Computing today s ystem Area Networks exhibit many of the same properties Traffic atterns Arbitrary (bisection bandwidth matters) Near-neighbor ermutations (e.g., FFT) ulticasts or Broadcasts? Latency or Bandwidth Critical? (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 1

2 Terms Network characterized by Topology physical structure of the graph Routing Algorithm through which paths can message flow witching trategy How data in message traverses its route Circuit witched vs acket witched Flow Control When does a packet (or portions of it) move along its route (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE ore Terms Given topology constructed by linking switches and network interfaces, must deliver packet from node A to node B Link: cable with connectors on each end connect switches to other switches or network interfaces witch: N inputs N outputs (degree N) hit: inimum # of bits physically moved across link in one cycle Flit: inimum # of bits move across link as a single unit acket: unit that requires routing information, some number of flits (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 2

3 Outline Topology witching, Routing, & Deadlock witch Design Flow Control Case tudies (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Topology tructure of the interconnect Determines witch Degree: number of links from a node Diameter: number of links crossed between nodes on maximum shortest path Average distance: number of hops to random destination Bisection: minimum number of links that separate the network into two halves Don t forget Network Interface (NI) On- and off-ramp to the interconnect freeway NI latency can dominate interconnect latency (reducing the importance of topology) (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 3

4 Topology Node Node Overhe ad Application W Interface HW Interface link link bandwidth Application W Interface HW Interface link link bandwidth Interconnect Bisection Bandwidth Latency (C) 2003 J. E. mith C/ECE Direct interconnect erformance/cost: witch cost: N Wire cost: const Avg. latency: const Bisection B/W: const Not neighbor optimized ay be local optimized Buses emory Capable of broadcast (good for coherence) Bandwidth not scalable => major problem Hierarchical buses?» Bisection B/W remains constant» Becomes neighbor optimized cache roc emory cache roc emory cache roc emory cache roc (C) 2003 J. E. mith C/ECE age 4

5 Rings Direct interconnect erformance/cost: witch cost: N Wire cost: N Avg. latency: N / 2 Bisection B/W: const neighbor optimized, if bi-directional probably local optimized RING Not easily scalable Hierarchical rings?» Bisection B/W remains constant» Becomes neighbor optimized (C) 2003 J. E. mith C/ECE Hypercubes n-dimensional unit cube Direct interconnect erformance/cost: witch cost: N log 2 N Wire cost: (N log 2 N) /2 Avg. latency: (log 2 N) /2 Bisection B/W: N /2 neighbor optimized probably local optimized latency and bandwidth scale well, BUT individual switch complexity grows => max size is built in (C) 2003 J. E. mith C/ECE age 5

6 2D Torus Direct interconnect erformance/cost: witch cost: N Wire cost: 2 N Avg. latency: N 1/2 Bisection B/W: 2 N 1/2 neighbor optimized probably local optimized (C) 2003 J. E. mith C/ECE D Torus, contd. Cost scales well Latency and bandwidth do not scale as well as hypercube, BUT difference is relatively small for practical-sized systems Weave nodes to make inter-node latencies const. Routing can be tricky (deadlocks) 2D esh similar, but without wraparound (C) 2003 J. E. mith C/ECE age 6

7 Direct interconnect erformance/cost: witch cost: N Wire cost: 3 N Avg. latency: 3(N 1/3 / 2) Bisection B/W: 2 N 2/3 neighbor optimized probably local optimized 3D Torus (C) 2003 J. E. mith C/ECE Cost scales well 3D Torus, contd. Latency and bandwidth do not scale as well as hypercube, BUT difference is relatively small for practical-sized systems eems to have become an interconnect of choice: Cray, Intel, Tera, DAH, etc. Routing can be tricky (deadlocks) 3D esh similar, but without wraparound (C) 2003 J. E. mith C/ECE age 7

8 Crossbars Indirect interconnect erformance/cost: witch cost: N 2 Wire cost: 2 N Avg. latency: const Bisection B/W: N Not neighbor optimized Not local optimized emory emory emory emory emory emory emory emory corssbar rocessor rocessor rocessor rocessor (C) 2003 J. E. mith C/ECE Crossbars, contd. Capable of broadcast No network conflicts Cost does not scale well Often implemented with muxes mux mux mux mux (C) 2003 J. E. mith C/ECE age 8

9 ultistage Interconnection Networks (INs) AKA: Banyan, Baseline, Omega networks, etc. Indirect interconnect Crossbars interconnected with shuffles Can be viewed as overlapped UX trees Destination address 0101 specifies the path The shuffle interconnect routes addresses in a binary fashion This can most easily be seen with INs in the form at right 1011 (C) 2003 J. E. mith C/ECE INs, contd. f switch outputs => decode log 2 f bits in switch erformance/cost: witch cost: (N log f N) / f Wire cost: N log f N Avg. latency: log f N Bisection B/W: N Not neighbor optimized roblem (in combination with latency growth) Not local optimized Capable of broadcast Commonly used in large UA systems (C) 2003 J. E. mith C/ECE age 9

10 ultistage Nets, Equivalence By rearranging switches, multistage nets can be shown to be equivalent A B C D E F G H A B C D E G F H A I A I B J B J C K C D L D N E E K F N F L G O G O H H (C) 2003 J. E. mith C/ECE Fat Trees Tree-like, with constant bandwidth at all levels Closely related to INs Indirect interconnect erformance/cost: witch cost: N log f N Wire cost: f N log f N Avg. latency: approx 2 log f N Bisection B/W: f N neighbor optimized may be local optimized Capable of broadcast (C) 2003 J. E. mith C/ECE age 10

11 Fat Trees, Contd. The IN-derived Fat Tree, is, in fact, a Fat Tree: However, the switching "nodes" in effect do not have full crossbar connectivity Q R T U V W X Q R T U V W X I J K L N O I J K L N O A B C D E F G H A B C D E F G H (C) 2003 J. E. mith C/ECE Important Topologies N = 1024 Type Degree Diameter Ave Dist Bisection Diam Ave D 1D mesh 2 N-1 2N/3 1 2D mesh 4 2(N 1/2-1) 2N 1/2 / 3 N 1/ D mesh 6 3(N 1/3-1) 3N 1/3 / 3 N 2/3 ~30 ~10 nd mesh 2n n(n 1/n - 1) nn 1/n / 3 N (n-1) / n (N = k n ) Ring 2 N / 2 N/4 2 2D torus 4 N 1/2 N 1/2 / 2 2N 1/ k-ary n-cube 2n n(n 1/n ) nn 1/n / (3D) (N = k n ) nk/2 nk/4 2k n-1 Hypercube n n = LogN n/2 N/ (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 11

12 Topologies (cont) N = 1024 Type Degree Diameter Ave Dist Bisection Diam Ave D 2D Tree 3 2Log 2 N ~2Log 2 N 1 20 ~20 4D Tree 5 2Log 4 N 2Log 4 N - 2/ kd k+1 Log k N 2D fat tree 4 Log 2 N N 2D butterfly 4 Log 2 N Log 2 N N/ C-5 Thinned Fat Tree (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Butterfly ultistage: nodes at ends, switches in middle N/2 Butterfly N/2 Butterfly All paths equal length Unique path from any input to any output Conflicts cause tree saturation Benes Network N Butterfly Reversed N Butterfly Routes all permutations w/o conflict Notice similarity to Fat Tree (Fold in half) Randomization is major breakthrough (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 12

13 Outline Topology witching, Routing, & Deadlock witch Design Flow Control Case tudies (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE ABCs of Networks tarting oint: end bits between 2 computers Queue on each end Can send both ways ( Bi-directional, Full Duplex ) Rules for communication? protocol ynchronous send» Need Request & Response signaling Name for standard group of bits sent: acket (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 13

14 A imple Example What is the packet format? Fixed? (for HW Interpretation) Number bytes? Request/ Response Address/Data 1 bit 32 bits 0: lease send data from Address 1: acket contains data corresponding to request (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Questions About imple Example What if more than 2 computers want to communicate? Need node identifier field (destination) in packet Routing and topology What if packet is garbled in transit? Add error detection field in packet (e.g., CRC) What if packet is lost? ore elaborate protocols to detect loss (e.g., NAK, time outs) What if multiple processes/machine? Dispatch Queue per process Questions such as these lead to more complex protocols and packet formats (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 14

15 General acket Format Header routing and control information ayload carries data (non HW specific information) can be further divided (framing, protocol stacks ) Error Code generally at tail of packet so it can be generated on the way out Header ayload Error Code (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE essage vs. acket A essage may be composed of several packets Applications reason about messages Network transfers packets mall fixed size packets. roblems? Fragmentation and reassembly (W overhead) Variable ize packets. roblems? Congestion (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 15

16 acket witched vs Circuit witched Circuit witched Establish Route then end Data Telephone system acket witched Route each packet individually Delivery Guarantees Reliable In order, what if not? (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE tore-and-forward Cut-through Virtual cut-through Wormhole Routing (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 16

17 tore and Forward tore-and-forward policy: each switch waits for the full packet to arrive in the switch before it is sent on to the next switch (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Cut Through Cut-through routing: switch examines the header, decides where to send the message, and then starts forwarding it immediately (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 17

18 Virtual Cut-Through What to do if output port is blocked? Lets the tail continue when the head is blocked, absorbing the whole message into a single switch. Requires a buffer large enough to hold the largest packet. Degenerates to store-and-forward with high contention (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Wormhole When the head of the message is blocked the message stays strung out over the network otentially blocks other messages (needs only buffer the piece of the packet that is sent between switches). C-5 used it, with each switch buffer being 4 bits per port. yrinet uses it Interaction with ack ize Can cause tree saturation (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 18

19 tore and Forward vs. Cut-Through Advantage Latency reduces from function of: number of intermediate switches times the size of the packet to time for 1st part of the packet to negotiate the switches + the packet size interconnect BW (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Routing Algorithm How do I know where a packet should go? Arithmetic ource-based Table Lookup Adaptive route based on network state (e.g., contention) (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 19

20 Arithmetic Routing For regular topology, simple arithmetic to determine route 2D esh (Also called NEW network) packet header contains signed offset to destination switch ++ or -- one field of header (x or y dimension) when x == 0 and y == 0, then at correct processor Requires ALU in switch ust recompute CRC (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE ource Based and Table Lookup Routing ource Based ource specifies output port for each switch in route Very imple witches no control state strip output port off header yrinet uses this Table Lookup Very mall Header, index into table for output port Big tables, must be kept up to date... (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 20

21 Deterministic v.s. Adaptive Routing Deterministic follows a prespecified route mesh: dimension-order routing» (x1, y1) -> (x2, y2)» first Dx = x2 - x1,» then Dy = y2 - y1, hypercube: edge-cube routing» X = x0x1x2...xn -> Y = y0y1y2...yn» R = X xor Y» Traverse dimensions of differing address in order tree: common ancestor Adaptive route determined by contention for output port (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Adaptive Routing Essential for fault tolerance at least multipath Can improve utilization of the network imple deterministic algorithms easily run into bad permutations Fully/partially adaptive, minimal/non-minimal Can introduce complexity or anomalies Little adaptation goes a long way! (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 21

22 Deadlock Break a Necessary Condition Use more than one resource Not willing to release resource in use Cycle in order of recourse use Guri ohi (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Deadlock Free Routing Virtual Channels Not virtual cut-through Add buffers so flits of wormhole packets can be interleaved Up*-Down* Number switches: higher = farther away from processors route up, make one turn, route down Turn odel Routing Restrict order of turns» West First» North Last» Negative First Can increase number of hops (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 22

23 inimal turn restrictions in 2D +y -x +x West-first north-last -y negative first (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Outline Topology witching, Routing, & Deadlock witch Design Flow Control Case tudies (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 23

24 A Generic witch At minimum, must route inputs to outputs Receiver Input Buffer Output Buffer Transmitter Input orts Cross-bar Output orts Control Routing, cheduling VLI makes it easier to create larger fully connected switches (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE witch Design Issues ports, pin limited data path (cross bar designs) non-blocking crossbar routing logic per input ALU table Finite tate achine for cut-through (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 24

25 witch Buffering need to absorb some transient peaks shared buffer pool need high bandwidth one output could hog all buffer space input buffers (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Input Buffering buffer per port routing logic head of line (HOL) blocking subsequent packet may be routed to unused output port (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 25

26 Output Buffering Buffers logically associated with output split on either side of cross bar Arbitration for physical link (output scheduling) static priority random round-robin oldest-first Effects of adaptive routing? elect output based on availability requires feedback from output port (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE tacked Dimension witch Uses only 2x2 switch to build higher dimension switch Host out z in 2x2 z out y in 2x2 y out x in 2x2 x out Host in (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 26

27 Outline Topology witching, Routing, & Deadlock witch Design Flow Control Case tudies (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Congestion Control acket switched networks do not reserve bandwidth; this leads to contention olution: prevent packets from entering until contention is reduced (e.g., metering lights) Options: End-to-end Flow Control Link-level Flow Control (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 27

28 Flow Control acket discarding: If a packet arrives at a switch and there is no room in the buffer, the packet is discarded no communication between switches, requires higher level protocol Flow control: between pairs of receivers and senders; use feedback to tell the sender when it is allowed to send the next packet (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Link-Level Flow Control Ready Data Transfer single flit when receiver is ready Could have long links with many flits in flight (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 28

29 Credit-based (Window) Flow Control Receiver gives N credits to sender sender decrements count stops sending if zero receiver sends back credit as it drains its buffer bundle credits to reduce overhead ust account for link latency (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Water Level high water, low water stop & go back to source switch (yrinet) can send redundant stop/go Incoming phits top Go Outgoing phits (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 29

30 Outline Topology witching, Routing, & Deadlock witch Design Flow Control Case tudies (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Case tudy Cray T3D 1024 switch nodes each connected to 2 processors 3D Torus, bidirectional, 300 B/s Link: 16 bits, 8 control bits Variable size packet (multiple of 16 bits) Logical request & response networks 2 virtual channels each for deadlock tacked dimension routing Wormhole for large packets, virtual cut-through for small packets (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 30

31 IB -2 (Vulcan) witch has eight bidirectional 40 B/s links Link: 8 data bits, 1 tag, 1 reverse flow-control Flit is 16 bits, phit is 8 input FIFO + output FIFO + central buffer byte segments (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Real achines Wide links, smaller routing delay Tremendous variation (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 31

Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies. Admin

Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies. Admin Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies Alvin R. Lebeck CPS 220 Admin Homework #5 Due Dec 3 Projects Final (yes it will be cumulative) CPS 220 2 1 Review: Terms Network characterized

More information

Routing Algorithm. How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus)

Routing Algorithm. How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus) Routing Algorithm How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus) Many routing algorithms exist 1) Arithmetic 2) Source-based 3) Table lookup

More information

Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996

Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996 Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996 RHK.S96 1 Review: ABCs of Networks Starting Point: Send bits between 2 computers Queue

More information

ECE 669 Parallel Computer Architecture

ECE 669 Parallel Computer Architecture ECE 669 Parallel Computer Architecture Lecture 21 Routing Outline Routing Switch Design Flow Control Case Studies Routing Routing algorithm determines which of the possible paths are used as routes how

More information

Lecture 12: Interconnection Networks. Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E)

Lecture 12: Interconnection Networks. Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) Lecture 12: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) 1 Topologies Internet topologies are not very regular they grew

More information

Interconnection Network

Interconnection Network Interconnection Network Recap: Generic Parallel Architecture A generic modern multiprocessor Network Mem Communication assist (CA) $ P Node: processor(s), memory system, plus communication assist Network

More information

NOW Handout Page 1. Outline. Networks: Routing and Design. Routing. Routing Mechanism. Routing Mechanism (cont) Properties of Routing Algorithms

NOW Handout Page 1. Outline. Networks: Routing and Design. Routing. Routing Mechanism. Routing Mechanism (cont) Properties of Routing Algorithms Outline Networks: Routing and Design Routing Switch Design Case Studies CS 5, Spring 99 David E. Culler Computer Science Division U.C. Berkeley 3/3/99 CS5 S99 Routing Recall: routing algorithm determines

More information

Lecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University 18 447 Lecture 26: Interconnects James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L26 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today get an overview of parallel

More information

Lecture: Interconnection Networks

Lecture: Interconnection Networks Lecture: Interconnection Networks Topics: Router microarchitecture, topologies Final exam next Tuesday: same rules as the first midterm 1 Packets/Flits A message is broken into multiple packets (each packet

More information

Recall: The Routing problem: Local decisions. Recall: Multidimensional Meshes and Tori. Properties of Routing Algorithms

Recall: The Routing problem: Local decisions. Recall: Multidimensional Meshes and Tori. Properties of Routing Algorithms CS252 Graduate Computer Architecture Lecture 16 Multiprocessor Networks (con t) March 14 th, 212 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~kubitron/cs252

More information

4. Networks. in parallel computers. Advances in Computer Architecture

4. Networks. in parallel computers. Advances in Computer Architecture 4. Networks in parallel computers Advances in Computer Architecture System architectures for parallel computers Control organization Single Instruction stream Multiple Data stream (SIMD) All processors

More information

CMSC 611: Advanced. Interconnection Networks

CMSC 611: Advanced. Interconnection Networks CMSC 611: Advanced Computer Architecture Interconnection Networks Interconnection Networks Massively parallel processor networks (MPP) Thousands of nodes Short distance (

More information

Interconnection Networks

Interconnection Networks Lecture 17: Interconnection Networks Parallel Computer Architecture and Programming A comment on web site comments It is okay to make a comment on a slide/topic that has already been commented on. In fact

More information

CS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99 CS258 S99 2

CS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99 CS258 S99 2 Real Machines Interconnection Network Topology Design Trade-offs CS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99

More information

TDT Appendix E Interconnection Networks

TDT Appendix E Interconnection Networks TDT 4260 Appendix E Interconnection Networks Review Advantages of a snooping coherency protocol? Disadvantages of a snooping coherency protocol? Advantages of a directory coherency protocol? Disadvantages

More information

Lecture 24: Interconnection Networks. Topics: topologies, routing, deadlocks, flow control

Lecture 24: Interconnection Networks. Topics: topologies, routing, deadlocks, flow control Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control 1 Topology Examples Grid Torus Hypercube Criteria Bus Ring 2Dtorus 6-cube Fully connected Performance Bisection

More information

Interconnection Networks: Topology. Prof. Natalie Enright Jerger

Interconnection Networks: Topology. Prof. Natalie Enright Jerger Interconnection Networks: Topology Prof. Natalie Enright Jerger Topology Overview Definition: determines arrangement of channels and nodes in network Analogous to road map Often first step in network design

More information

Interconnection topologies (cont.) [ ] In meshes and hypercubes, the average distance increases with the dth root of N.

Interconnection topologies (cont.) [ ] In meshes and hypercubes, the average distance increases with the dth root of N. Interconnection topologies (cont.) [ 10.4.4] In meshes and hypercubes, the average distance increases with the dth root of N. In a tree, the average distance grows only logarithmically. A simple tree structure,

More information

Lecture 12: Interconnection Networks. Topics: dimension/arity, routing, deadlock, flow control

Lecture 12: Interconnection Networks. Topics: dimension/arity, routing, deadlock, flow control Lecture 12: Interconnection Networks Topics: dimension/arity, routing, deadlock, flow control 1 Interconnection Networks Recall: fully connected network, arrays/rings, meshes/tori, trees, butterflies,

More information

Interconnection networks

Interconnection networks Interconnection networks When more than one processor needs to access a memory structure, interconnection networks are needed to route data from processors to memories (concurrent access to a shared memory

More information

Routing Algorithms. Review

Routing Algorithms. Review Routing Algorithms Today s topics: Deterministic, Oblivious Adaptive, & Adaptive models Problems: efficiency livelock deadlock 1 CS6810 Review Network properties are a combination topology topology dependent

More information

Physical Organization of Parallel Platforms. Alexandre David

Physical Organization of Parallel Platforms. Alexandre David Physical Organization of Parallel Platforms Alexandre David 1.2.05 1 Static vs. Dynamic Networks 13-02-2008 Alexandre David, MVP'08 2 Interconnection networks built using links and switches. How to connect:

More information

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance Lecture 13: Interconnection Networks Topics: lots of background, recent innovations for power and performance 1 Interconnection Networks Recall: fully connected network, arrays/rings, meshes/tori, trees,

More information

Communication Performance in Network-on-Chips

Communication Performance in Network-on-Chips Communication Performance in Network-on-Chips Axel Jantsch Royal Institute of Technology, Stockholm November 24, 2004 Network on Chip Seminar, Linköping, November 25, 2004 Communication Performance In

More information

Lecture 15: PCM, Networks. Today: PCM wrap-up, projects discussion, on-chip networks background

Lecture 15: PCM, Networks. Today: PCM wrap-up, projects discussion, on-chip networks background Lecture 15: PCM, Networks Today: PCM wrap-up, projects discussion, on-chip networks background 1 Hard Error Tolerance in PCM PCM cells will eventually fail; important to cause gradual capacity degradation

More information

Module 17: "Interconnection Networks" Lecture 37: "Introduction to Routers" Interconnection Networks. Fundamentals. Latency and bandwidth

Module 17: Interconnection Networks Lecture 37: Introduction to Routers Interconnection Networks. Fundamentals. Latency and bandwidth Interconnection Networks Fundamentals Latency and bandwidth Router architecture Coherence protocol and routing [From Chapter 10 of Culler, Singh, Gupta] file:///e /parallel_com_arch/lecture37/37_1.htm[6/13/2012

More information

Lecture 15: Networks & Interconnect Interface, Switches, Routing, Examples Professor David A. Patterson Computer Science 252 Fall 1996

Lecture 15: Networks & Interconnect Interface, Switches, Routing, Examples Professor David A. Patterson Computer Science 252 Fall 1996 Lecture 15: Networks & Interconnect Interface, Switches, Routing, Examples Professor David A. Patterson Computer Science 252 Fall 1996 DAP.F96 1 Review: Memory Hierarchies File Cache Hard Disk Tapes On-Line

More information

Interconnection Networks. Issues for Networks

Interconnection Networks. Issues for Networks Interconnection Networks Communications Among Processors Chris Nevison, Colgate University Issues for Networks Total Bandwidth amount of data which can be moved from somewhere to somewhere per unit time

More information

Lecture 16: On-Chip Networks. Topics: Cache networks, NoC basics

Lecture 16: On-Chip Networks. Topics: Cache networks, NoC basics Lecture 16: On-Chip Networks Topics: Cache networks, NoC basics 1 Traditional Networks Huh et al. ICS 05, Beckmann MICRO 04 Example designs for contiguous L2 cache regions 2 Explorations for Optimality

More information

J. E. Smith. Introduction Software Scaling Hardware Scaling Case studies MIT J Machine Cray T3D Cray T3E* CM-5**

J. E. Smith. Introduction Software Scaling Hardware Scaling Case studies MIT J Machine Cray T3D Cray T3E* CM-5** Lecture Outline assively arallel rocessors ECE/C 757 pring 2007 J. E. mith Copyright (C) 2007 by James E. mith (unless noted otherwise) All rights reserved. Except for use in ECE/C 757, no part of these

More information

Lecture: Interconnection Networks. Topics: TM wrap-up, routing, deadlock, flow control, virtual channels

Lecture: Interconnection Networks. Topics: TM wrap-up, routing, deadlock, flow control, virtual channels Lecture: Interconnection Networks Topics: TM wrap-up, routing, deadlock, flow control, virtual channels 1 TM wrap-up Eager versioning: create a log of old values Handling problematic situations with a

More information

CS252 Graduate Computer Architecture Lecture 14. Multiprocessor Networks March 9 th, 2011

CS252 Graduate Computer Architecture Lecture 14. Multiprocessor Networks March 9 th, 2011 CS252 Graduate Computer Architecture Lecture 14 Multiprocessor Networks March 9 th, 2011 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~kubitron/cs252

More information

Lecture 2: Topology - I

Lecture 2: Topology - I ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 2: Topology - I Tushar Krishna Assistant Professor School of Electrical and

More information

Network Properties, Scalability and Requirements For Parallel Processing. Communication assist (CA)

Network Properties, Scalability and Requirements For Parallel Processing. Communication assist (CA) Network Properties, Scalability and Requirements For Parallel Processing Scalable Parallel Performance: Continue to achieve good parallel performance "speedup"as the sizes of the system/problem are increased.

More information

CS521 CSE IITG 11/23/2012

CS521 CSE IITG 11/23/2012 A ahu 1 Topology : who is connected to whom? Direct / Indirect : where is switching done? tatic / Dynamic : when is switching done? Circuit switching / packet switching : how are connections estalished?

More information

Interconnection Network. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Interconnection Network. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University Interconnection Network Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Topics Taxonomy Metric Topologies Characteristics Cost Performance 2 Interconnection

More information

Interconnection Networks

Interconnection Networks Lecture 18: Interconnection Networks Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Credit: many of these slides were created by Michael Papamichael This lecture is partially

More information

Network Properties, Scalability and Requirements For Parallel Processing. Communication assist (CA)

Network Properties, Scalability and Requirements For Parallel Processing. Communication assist (CA) Network Properties, Scalability and Requirements For Parallel Processing Scalable Parallel Performance: Continue to achieve good parallel performance "speedup"as the sizes of the system/problem are increased.

More information

CS575 Parallel Processing

CS575 Parallel Processing CS575 Parallel Processing Lecture three: Interconnection Networks Wim Bohm, CSU Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 license.

More information

Chapter 9 Multiprocessors

Chapter 9 Multiprocessors ECE200 Computer Organization Chapter 9 Multiprocessors David H. lbonesi and the University of Rochester Henk Corporaal, TU Eindhoven, Netherlands Jari Nurmi, Tampere University of Technology, Finland University

More information

Lecture 3: Topology - II

Lecture 3: Topology - II ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 3: Topology - II Tushar Krishna Assistant Professor School of Electrical and

More information

Basic Low Level Concepts

Basic Low Level Concepts Course Outline Basic Low Level Concepts Case Studies Operation through multiple switches: Topologies & Routing v Direct, indirect, regular, irregular Formal models and analysis for deadlock and livelock

More information

Interconnection Network

Interconnection Network Interconnection Network Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE3054: Multicore Systems, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu) Topics

More information

Topologies. Maurizio Palesi. Maurizio Palesi 1

Topologies. Maurizio Palesi. Maurizio Palesi 1 Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and

More information

INTERCONNECTION NETWORKS LECTURE 4

INTERCONNECTION NETWORKS LECTURE 4 INTERCONNECTION NETWORKS LECTURE 4 DR. SAMMAN H. AMEEN 1 Topology Specifies way switches are wired Affects routing, reliability, throughput, latency, building ease Routing How does a message get from source

More information

Distributed Memory Machines. Distributed Memory Machines

Distributed Memory Machines. Distributed Memory Machines Distributed Memory Machines Arvind Krishnamurthy Fall 2004 Distributed Memory Machines Intel Paragon, Cray T3E, IBM SP Each processor is connected to its own memory and cache: cannot directly access another

More information

Multiprocessor Interconnection Networks

Multiprocessor Interconnection Networks Multiprocessor Interconnection Networks Todd C. Mowry CS 740 November 19, 1998 Topics Network design space Contention Active messages Networks Design Options: Topology Routing Direct vs. Indirect Physical

More information

NOW Handout Page 1. Recap: Gigaplane Bus Timing. Scalability

NOW Handout Page 1. Recap: Gigaplane Bus Timing. Scalability Recap: Gigaplane Bus Timing 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Address Rd A Rd B Scalability State Arbitration 1 4,5 2 Share ~Own 6 Own 7 A D A D A D A D A D A D A D A D CS 258, Spring 99 David E. Culler

More information

Lecture 18: Communication Models and Architectures: Interconnection Networks

Lecture 18: Communication Models and Architectures: Interconnection Networks Design & Co-design of Embedded Systems Lecture 18: Communication Models and Architectures: Interconnection Networks Sharif University of Technology Computer Engineering g Dept. Winter-Spring 2008 Mehdi

More information

Network-on-chip (NOC) Topologies

Network-on-chip (NOC) Topologies Network-on-chip (NOC) Topologies 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and performance

More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics 7. Output Scheduling for QoS Guarantees 8.

More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics 7. Output Scheduling for QoS Guarantees 8.

More information

Dr e v prasad Dt

Dr e v prasad Dt Dr e v prasad Dt. 12.10.17 Contents Characteristics of Multiprocessors Interconnection Structures Inter Processor Arbitration Inter Processor communication and synchronization Cache Coherence Introduction

More information

EECS 570. Lecture 19 Interconnects: Flow Control. Winter 2018 Subhankar Pal

EECS 570. Lecture 19 Interconnects: Flow Control. Winter 2018 Subhankar Pal Lecture 19 Interconnects: Flow Control Winter 2018 Subhankar Pal http://www.eecs.umich.edu/courses/eecs570/ Slides developed in part by Profs. Adve, Falsafi, Hill, Lebeck, Martin, Narayanasamy, Nowatzyk,

More information

CSC630/CSC730: Parallel Computing

CSC630/CSC730: Parallel Computing CSC630/CSC730: Parallel Computing Parallel Computing Platforms Chapter 2 (2.4.1 2.4.4) Dr. Joe Zhang PDC-4: Topology 1 Content Parallel computing platforms Logical organization (a programmer s view) Control

More information

Interconnect Technology and Computational Speed

Interconnect Technology and Computational Speed Interconnect Technology and Computational Speed From Chapter 1 of B. Wilkinson et al., PARAL- LEL PROGRAMMING. Techniques and Applications Using Networked Workstations and Parallel Computers, augmented

More information

[ ] In earlier lectures, we have seen that switches in an interconnection network connect inputs to outputs, usually with some kind buffering.

[ ] In earlier lectures, we have seen that switches in an interconnection network connect inputs to outputs, usually with some kind buffering. Switch Design [ 10.3.2] In earlier lectures, we have seen that switches in an interconnection network connect inputs to outputs, usually with some kind buffering. Here is a basic diagram of a switch. Receiver

More information

EE382 Processor Design. Illinois

EE382 Processor Design. Illinois EE382 Processor Design Winter 1998 Chapter 8 Lectures Multiprocessors Part II EE 382 Processor Design Winter 98/99 Michael Flynn 1 Illinois EE 382 Processor Design Winter 98/99 Michael Flynn 2 1 Write-invalidate

More information

Internet II. CS10 : Beauty and Joy of Computing. cs10.berkeley.edu. !!Senior Lecturer SOE Dan Garcia!!! Garcia UCB!

Internet II. CS10 : Beauty and Joy of Computing. cs10.berkeley.edu. !!Senior Lecturer SOE Dan Garcia!!!  Garcia UCB! cs10.berkeley.edu CS10 : Beauty and Joy of Computing Internet II!!Senior Lecturer SOE Dan Garcia!!!www.cs.berkeley.edu/~ddgarcia CS10 L17 Internet II (1)! Why Networks?! Originally sharing I/O devices

More information

ECE 4750 Computer Architecture, Fall 2017 T06 Fundamental Network Concepts

ECE 4750 Computer Architecture, Fall 2017 T06 Fundamental Network Concepts ECE 4750 Computer Architecture, Fall 2017 T06 Fundamental Network Concepts School of Electrical and Computer Engineering Cornell University revision: 2017-10-17-12-26 1 Network/Roadway Analogy 3 1.1. Running

More information

II. Principles of Computer Communications Network and Transport Layer

II. Principles of Computer Communications Network and Transport Layer II. Principles of Computer Communications Network and Transport Layer A. Internet Protocol (IP) IPv4 Header An IP datagram consists of a header part and a text part. The header has a 20-byte fixed part

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #5 1/29/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 From last class Outline

More information

L14-15 Scalable Interconnection Networks. Scalable, High Performance Network

L14-15 Scalable Interconnection Networks. Scalable, High Performance Network L14-15 Scalable Interconnection Networks 1 Scalable, High Performance Network At Core of Parallel Computer Architecture Requirements and trade-offs at many levels Elegant mathematical structure Deep relationships

More information

Networks. Distributed Systems. Philipp Kupferschmied. Universität Karlsruhe, System Architecture Group. May 6th, 2009

Networks. Distributed Systems. Philipp Kupferschmied. Universität Karlsruhe, System Architecture Group. May 6th, 2009 Networks Distributed Systems Philipp Kupferschmied Universität Karlsruhe, System Architecture Group May 6th, 2009 Philipp Kupferschmied Networks 1/ 41 1 Communication Basics Introduction Layered Communication

More information

Interconnection Networks

Interconnection Networks Lecture 15: Interconnection Networks Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2016 Credit: some slides created by Michael Papamichael, others based on slides from Onur Mutlu

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #4 1/24/2018 Xuehai Qian xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Announcements PA #1

More information

Parallel Computing Interconnection Networks

Parallel Computing Interconnection Networks Parallel Computing Interconnection Networks Readings: Hager s book (4.5) Pacheco s book (chapter 2.3.3) http://pages.cs.wisc.edu/~tvrdik/5/html/section5.html#aaaaatre e-based topologies Slides credit:

More information

Parallel Systems Prof. James L. Frankel Harvard University. Version of 6:50 PM 4-Dec-2018 Copyright 2018, 2017 James L. Frankel. All rights reserved.

Parallel Systems Prof. James L. Frankel Harvard University. Version of 6:50 PM 4-Dec-2018 Copyright 2018, 2017 James L. Frankel. All rights reserved. Parallel Systems Prof. James L. Frankel Harvard University Version of 6:50 PM 4-Dec-2018 Copyright 2018, 2017 James L. Frankel. All rights reserved. Architectures SISD (Single Instruction, Single Data)

More information

Lecture: Transactional Memory, Networks. Topics: TM implementations, on-chip networks

Lecture: Transactional Memory, Networks. Topics: TM implementations, on-chip networks Lecture: Transactional Memory, Networks Topics: TM implementations, on-chip networks 1 Summary of TM Benefits As easy to program as coarse-grain locks Performance similar to fine-grain locks Avoids deadlock

More information

Advanced Computer Networks Data Center Architecture. Patrick Stuedi, Qin Yin, Timothy Roscoe Spring Semester 2015

Advanced Computer Networks Data Center Architecture. Patrick Stuedi, Qin Yin, Timothy Roscoe Spring Semester 2015 Advanced Computer Networks 263-3825-00 Data Center Architecture Patrick Stuedi, Qin Yin, Timothy Roscoe Spring Semester 2015 1 MORE ABOUT TOPOLOGIES 2 Bisection Bandwidth Bisection bandwidth: Sum of the

More information

EE 382C Interconnection Networks

EE 382C Interconnection Networks EE 8C Interconnection Networks Deadlock and Livelock Stanford University - EE8C - Spring 6 Deadlock and Livelock: Terminology Deadlock: A condition in which an agent waits indefinitely trying to acquire

More information

Topology basics. Constraints and measures. Butterfly networks.

Topology basics. Constraints and measures. Butterfly networks. EE48: Advanced Computer Organization Lecture # Interconnection Networks Architecture and Design Stanford University Topology basics. Constraints and measures. Butterfly networks. Lecture #: Monday, 7 April

More information

Interprocessor Communication. Basics of Network Routing

Interprocessor Communication. Basics of Network Routing Interprocessor Communication There are two main differences between sequential computers and parallel computers -- multiple processors and the hardware to connect them together. That hardware is the most

More information

Characteristics of Mult l ip i ro r ce c ssors r

Characteristics of Mult l ip i ro r ce c ssors r Characteristics of Multiprocessors A multiprocessor system is an interconnection of two or more CPUs with memory and input output equipment. The term processor in multiprocessor can mean either a central

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #11 2/21/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Outline Midterm 1:

More information

Lecture 14: Large Cache Design III. Topics: Replacement policies, associativity, cache networks, networking basics

Lecture 14: Large Cache Design III. Topics: Replacement policies, associativity, cache networks, networking basics Lecture 14: Large Cache Design III Topics: Replacement policies, associativity, cache networks, networking basics 1 LIN Qureshi et al., ISCA 06 Memory level parallelism (MLP): number of misses that simultaneously

More information

Portland State University ECE 588/688. Directory-Based Cache Coherence Protocols

Portland State University ECE 588/688. Directory-Based Cache Coherence Protocols Portland State University ECE 588/688 Directory-Based Cache Coherence Protocols Copyright by Alaa Alameldeen and Haitham Akkary 2018 Why Directory Protocols? Snooping-based protocols may not scale All

More information

Lecture 2 Parallel Programming Platforms

Lecture 2 Parallel Programming Platforms Lecture 2 Parallel Programming Platforms Flynn s Taxonomy In 1966, Michael Flynn classified systems according to numbers of instruction streams and the number of data stream. Data stream Single Multiple

More information

EEC-484/584 Computer Networks

EEC-484/584 Computer Networks EEC-484/584 Computer Networks Lecture 13 wenbing@ieee.org (Lecture nodes are based on materials supplied by Dr. Louise Moser at UCSB and Prentice-Hall) Outline 2 Review of lecture 12 Routing Congestion

More information

High Performance Computing Programming Paradigms and Scalability Part 2: High-Performance Networks

High Performance Computing Programming Paradigms and Scalability Part 2: High-Performance Networks High Performance Computing Programming Paradigms and Scalability Part 2: High-Performance Networks PD Dr. rer. nat. habil. Ralf-Peter Mundani Computation in Engineering (CiE) Scientific Computing (SCCS)

More information

Interconnection Networks: Flow Control. Prof. Natalie Enright Jerger

Interconnection Networks: Flow Control. Prof. Natalie Enright Jerger Interconnection Networks: Flow Control Prof. Natalie Enright Jerger Switching/Flow Control Overview Topology: determines connectivity of network Routing: determines paths through network Flow Control:

More information

Parallel Computer Architecture II

Parallel Computer Architecture II Parallel Computer Architecture II Stefan Lang Interdisciplinary Center for Scientific Computing (IWR) University of Heidelberg INF 368, Room 532 D-692 Heidelberg phone: 622/54-8264 email: Stefan.Lang@iwr.uni-heidelberg.de

More information

18-740/640 Computer Architecture Lecture 16: Interconnection Networks. Prof. Onur Mutlu Carnegie Mellon University Fall 2015, 11/4/2015

18-740/640 Computer Architecture Lecture 16: Interconnection Networks. Prof. Onur Mutlu Carnegie Mellon University Fall 2015, 11/4/2015 18-740/640 Computer Architecture Lecture 16: Interconnection Networks Prof. Onur Mutlu Carnegie Mellon University Fall 2015, 11/4/2015 Required Readings Required Reading Assignment: Dubois, Annavaram,

More information

Parallel Architecture. Sathish Vadhiyar

Parallel Architecture. Sathish Vadhiyar Parallel Architecture Sathish Vadhiyar Motivations of Parallel Computing Faster execution times From days or months to hours or seconds E.g., climate modelling, bioinformatics Large amount of data dictate

More information

Deadlock and Router Micro-Architecture

Deadlock and Router Micro-Architecture 1 EE482: Advanced Computer Organization Lecture #8 Interconnection Network Architecture and Design Stanford University 22 April 1999 Deadlock and Router Micro-Architecture Lecture #8: 22 April 1999 Lecturer:

More information

Lecture 3: Flow-Control

Lecture 3: Flow-Control High-Performance On-Chip Interconnects for Emerging SoCs http://tusharkrishna.ece.gatech.edu/teaching/nocs_acaces17/ ACACES Summer School 2017 Lecture 3: Flow-Control Tushar Krishna Assistant Professor

More information

CS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 9. Routing and Flow Control

CS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 9. Routing and Flow Control CS 498 Hot Topics in High Performance Computing Networks and Fault Tolerance 9. Routing and Flow Control Intro What did we learn in the last lecture Topology metrics Including minimum diameter of directed

More information

Deadlock and Livelock. Maurizio Palesi

Deadlock and Livelock. Maurizio Palesi Deadlock and Livelock 1 Deadlock (When?) Deadlock can occur in an interconnection network, when a group of packets cannot make progress, because they are waiting on each other to release resource (buffers,

More information

Growth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs.

Growth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs. Internetworking Multiple networks are a fact of life: Growth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs. Fault isolation,

More information

EECS 122: Introduction to Computer Networks Switch and Router Architectures. Today s Lecture

EECS 122: Introduction to Computer Networks Switch and Router Architectures. Today s Lecture EECS : Introduction to Computer Networks Switch and Router Architectures Computer Science Division Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley,

More information

Learning Curve for Parallel Applications. 500 Fastest Computers

Learning Curve for Parallel Applications. 500 Fastest Computers Learning Curve for arallel Applications ABER molecular dynamics simulation program Starting point was vector code for Cray-1 145 FLO on Cray90, 406 for final version on 128-processor aragon, 891 on 128-processor

More information

CSC 4900 Computer Networks: Network Layer

CSC 4900 Computer Networks: Network Layer CSC 4900 Computer Networks: Network Layer Professor Henry Carter Fall 2017 Villanova University Department of Computing Sciences Review What is AIMD? When do we use it? What is the steady state profile

More information

Lecture 16: Network Layer Overview, Internet Protocol

Lecture 16: Network Layer Overview, Internet Protocol Lecture 16: Network Layer Overview, Internet Protocol COMP 332, Spring 2018 Victoria Manfredi Acknowledgements: materials adapted from Computer Networking: A Top Down Approach 7 th edition: 1996-2016,

More information

Generic Architecture. EECS 122: Introduction to Computer Networks Switch and Router Architectures. Shared Memory (1 st Generation) Today s Lecture

Generic Architecture. EECS 122: Introduction to Computer Networks Switch and Router Architectures. Shared Memory (1 st Generation) Today s Lecture Generic Architecture EECS : Introduction to Computer Networks Switch and Router Architectures Computer Science Division Department of Electrical Engineering and Computer Sciences University of California,

More information

Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems.

Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems. Cluster Networks Introduction Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems. As usual, the driver is performance

More information

Chapter 3 : Topology basics

Chapter 3 : Topology basics 1 Chapter 3 : Topology basics What is the network topology Nomenclature Traffic pattern Performance Packaging cost Case study: the SGI Origin 2000 2 Network topology (1) It corresponds to the static arrangement

More information

Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies. Mohsin Y Ahmed Conlan Wesson

Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies. Mohsin Y Ahmed Conlan Wesson Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies Mohsin Y Ahmed Conlan Wesson Overview NoC: Future generation of many core processor on a single chip

More information

Wormhole Routing Techniques for Directly Connected Multicomputer Systems

Wormhole Routing Techniques for Directly Connected Multicomputer Systems Wormhole Routing Techniques for Directly Connected Multicomputer Systems PRASANT MOHAPATRA Iowa State University, Department of Electrical and Computer Engineering, 201 Coover Hall, Iowa State University,

More information

From Routing to Traffic Engineering

From Routing to Traffic Engineering 1 From Routing to Traffic Engineering Robert Soulé Advanced Networking Fall 2016 2 In the beginning B Goal: pair-wise connectivity (get packets from A to B) Approach: configure static rules in routers

More information

JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS

JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS 1 JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS Shabnam Badri THESIS WORK 2011 ELECTRONICS JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS

More information