Interconnection Networks
|
|
- Ginger Simmons
- 6 years ago
- Views:
Transcription
1 C/ECE 757: Advanced Computer Architecture II (arallel Computer Architecture) Interconnection Network Design (Chapter 10) Copyright 2001 ark D. Hill University of Wisconsin-adison lides are derived from work by arita Adve (Illinois), Babak Falsafi (CU), Alvy Lebeck (Duke), teve Reinhardt (ichigan), and J.. ingh (rinceton). Thanks! Interconnection Networks Goal: Communication between computers Warning: Terminology-rich environment Focus on Networks for arallel Computing today s ystem Area Networks exhibit many of the same properties Traffic atterns Arbitrary (bisection bandwidth matters) Near-neighbor ermutations (e.g., FFT) ulticasts or Broadcasts? Latency or Bandwidth Critical? (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 1
2 Terms Network characterized by Topology physical structure of the graph Routing Algorithm through which paths can message flow witching trategy How data in message traverses its route Circuit witched vs acket witched Flow Control When does a packet (or portions of it) move along its route (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE ore Terms Given topology constructed by linking switches and network interfaces, must deliver packet from node A to node B Link: cable with connectors on each end connect switches to other switches or network interfaces witch: N inputs N outputs (degree N) hit: inimum # of bits physically moved across link in one cycle Flit: inimum # of bits move across link as a single unit acket: unit that requires routing information, some number of flits (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 2
3 Outline Topology witching, Routing, & Deadlock witch Design Flow Control Case tudies (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Topology tructure of the interconnect Determines witch Degree: number of links from a node Diameter: number of links crossed between nodes on maximum shortest path Average distance: number of hops to random destination Bisection: minimum number of links that separate the network into two halves Don t forget Network Interface (NI) On- and off-ramp to the interconnect freeway NI latency can dominate interconnect latency (reducing the importance of topology) (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 3
4 Topology Node Node Overhe ad Application W Interface HW Interface link link bandwidth Application W Interface HW Interface link link bandwidth Interconnect Bisection Bandwidth Latency (C) 2003 J. E. mith C/ECE Direct interconnect erformance/cost: witch cost: N Wire cost: const Avg. latency: const Bisection B/W: const Not neighbor optimized ay be local optimized Buses emory Capable of broadcast (good for coherence) Bandwidth not scalable => major problem Hierarchical buses?» Bisection B/W remains constant» Becomes neighbor optimized cache roc emory cache roc emory cache roc emory cache roc (C) 2003 J. E. mith C/ECE age 4
5 Rings Direct interconnect erformance/cost: witch cost: N Wire cost: N Avg. latency: N / 2 Bisection B/W: const neighbor optimized, if bi-directional probably local optimized RING Not easily scalable Hierarchical rings?» Bisection B/W remains constant» Becomes neighbor optimized (C) 2003 J. E. mith C/ECE Hypercubes n-dimensional unit cube Direct interconnect erformance/cost: witch cost: N log 2 N Wire cost: (N log 2 N) /2 Avg. latency: (log 2 N) /2 Bisection B/W: N /2 neighbor optimized probably local optimized latency and bandwidth scale well, BUT individual switch complexity grows => max size is built in (C) 2003 J. E. mith C/ECE age 5
6 2D Torus Direct interconnect erformance/cost: witch cost: N Wire cost: 2 N Avg. latency: N 1/2 Bisection B/W: 2 N 1/2 neighbor optimized probably local optimized (C) 2003 J. E. mith C/ECE D Torus, contd. Cost scales well Latency and bandwidth do not scale as well as hypercube, BUT difference is relatively small for practical-sized systems Weave nodes to make inter-node latencies const. Routing can be tricky (deadlocks) 2D esh similar, but without wraparound (C) 2003 J. E. mith C/ECE age 6
7 Direct interconnect erformance/cost: witch cost: N Wire cost: 3 N Avg. latency: 3(N 1/3 / 2) Bisection B/W: 2 N 2/3 neighbor optimized probably local optimized 3D Torus (C) 2003 J. E. mith C/ECE Cost scales well 3D Torus, contd. Latency and bandwidth do not scale as well as hypercube, BUT difference is relatively small for practical-sized systems eems to have become an interconnect of choice: Cray, Intel, Tera, DAH, etc. Routing can be tricky (deadlocks) 3D esh similar, but without wraparound (C) 2003 J. E. mith C/ECE age 7
8 Crossbars Indirect interconnect erformance/cost: witch cost: N 2 Wire cost: 2 N Avg. latency: const Bisection B/W: N Not neighbor optimized Not local optimized emory emory emory emory emory emory emory emory corssbar rocessor rocessor rocessor rocessor (C) 2003 J. E. mith C/ECE Crossbars, contd. Capable of broadcast No network conflicts Cost does not scale well Often implemented with muxes mux mux mux mux (C) 2003 J. E. mith C/ECE age 8
9 ultistage Interconnection Networks (INs) AKA: Banyan, Baseline, Omega networks, etc. Indirect interconnect Crossbars interconnected with shuffles Can be viewed as overlapped UX trees Destination address 0101 specifies the path The shuffle interconnect routes addresses in a binary fashion This can most easily be seen with INs in the form at right 1011 (C) 2003 J. E. mith C/ECE INs, contd. f switch outputs => decode log 2 f bits in switch erformance/cost: witch cost: (N log f N) / f Wire cost: N log f N Avg. latency: log f N Bisection B/W: N Not neighbor optimized roblem (in combination with latency growth) Not local optimized Capable of broadcast Commonly used in large UA systems (C) 2003 J. E. mith C/ECE age 9
10 ultistage Nets, Equivalence By rearranging switches, multistage nets can be shown to be equivalent A B C D E F G H A B C D E G F H A I A I B J B J C K C D L D N E E K F N F L G O G O H H (C) 2003 J. E. mith C/ECE Fat Trees Tree-like, with constant bandwidth at all levels Closely related to INs Indirect interconnect erformance/cost: witch cost: N log f N Wire cost: f N log f N Avg. latency: approx 2 log f N Bisection B/W: f N neighbor optimized may be local optimized Capable of broadcast (C) 2003 J. E. mith C/ECE age 10
11 Fat Trees, Contd. The IN-derived Fat Tree, is, in fact, a Fat Tree: However, the switching "nodes" in effect do not have full crossbar connectivity Q R T U V W X Q R T U V W X I J K L N O I J K L N O A B C D E F G H A B C D E F G H (C) 2003 J. E. mith C/ECE Important Topologies N = 1024 Type Degree Diameter Ave Dist Bisection Diam Ave D 1D mesh 2 N-1 2N/3 1 2D mesh 4 2(N 1/2-1) 2N 1/2 / 3 N 1/ D mesh 6 3(N 1/3-1) 3N 1/3 / 3 N 2/3 ~30 ~10 nd mesh 2n n(n 1/n - 1) nn 1/n / 3 N (n-1) / n (N = k n ) Ring 2 N / 2 N/4 2 2D torus 4 N 1/2 N 1/2 / 2 2N 1/ k-ary n-cube 2n n(n 1/n ) nn 1/n / (3D) (N = k n ) nk/2 nk/4 2k n-1 Hypercube n n = LogN n/2 N/ (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 11
12 Topologies (cont) N = 1024 Type Degree Diameter Ave Dist Bisection Diam Ave D 2D Tree 3 2Log 2 N ~2Log 2 N 1 20 ~20 4D Tree 5 2Log 4 N 2Log 4 N - 2/ kd k+1 Log k N 2D fat tree 4 Log 2 N N 2D butterfly 4 Log 2 N Log 2 N N/ C-5 Thinned Fat Tree (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Butterfly ultistage: nodes at ends, switches in middle N/2 Butterfly N/2 Butterfly All paths equal length Unique path from any input to any output Conflicts cause tree saturation Benes Network N Butterfly Reversed N Butterfly Routes all permutations w/o conflict Notice similarity to Fat Tree (Fold in half) Randomization is major breakthrough (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 12
13 Outline Topology witching, Routing, & Deadlock witch Design Flow Control Case tudies (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE ABCs of Networks tarting oint: end bits between 2 computers Queue on each end Can send both ways ( Bi-directional, Full Duplex ) Rules for communication? protocol ynchronous send» Need Request & Response signaling Name for standard group of bits sent: acket (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 13
14 A imple Example What is the packet format? Fixed? (for HW Interpretation) Number bytes? Request/ Response Address/Data 1 bit 32 bits 0: lease send data from Address 1: acket contains data corresponding to request (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Questions About imple Example What if more than 2 computers want to communicate? Need node identifier field (destination) in packet Routing and topology What if packet is garbled in transit? Add error detection field in packet (e.g., CRC) What if packet is lost? ore elaborate protocols to detect loss (e.g., NAK, time outs) What if multiple processes/machine? Dispatch Queue per process Questions such as these lead to more complex protocols and packet formats (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 14
15 General acket Format Header routing and control information ayload carries data (non HW specific information) can be further divided (framing, protocol stacks ) Error Code generally at tail of packet so it can be generated on the way out Header ayload Error Code (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE essage vs. acket A essage may be composed of several packets Applications reason about messages Network transfers packets mall fixed size packets. roblems? Fragmentation and reassembly (W overhead) Variable ize packets. roblems? Congestion (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 15
16 acket witched vs Circuit witched Circuit witched Establish Route then end Data Telephone system acket witched Route each packet individually Delivery Guarantees Reliable In order, what if not? (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE tore-and-forward Cut-through Virtual cut-through Wormhole Routing (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 16
17 tore and Forward tore-and-forward policy: each switch waits for the full packet to arrive in the switch before it is sent on to the next switch (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Cut Through Cut-through routing: switch examines the header, decides where to send the message, and then starts forwarding it immediately (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 17
18 Virtual Cut-Through What to do if output port is blocked? Lets the tail continue when the head is blocked, absorbing the whole message into a single switch. Requires a buffer large enough to hold the largest packet. Degenerates to store-and-forward with high contention (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Wormhole When the head of the message is blocked the message stays strung out over the network otentially blocks other messages (needs only buffer the piece of the packet that is sent between switches). C-5 used it, with each switch buffer being 4 bits per port. yrinet uses it Interaction with ack ize Can cause tree saturation (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 18
19 tore and Forward vs. Cut-Through Advantage Latency reduces from function of: number of intermediate switches times the size of the packet to time for 1st part of the packet to negotiate the switches + the packet size interconnect BW (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Routing Algorithm How do I know where a packet should go? Arithmetic ource-based Table Lookup Adaptive route based on network state (e.g., contention) (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 19
20 Arithmetic Routing For regular topology, simple arithmetic to determine route 2D esh (Also called NEW network) packet header contains signed offset to destination switch ++ or -- one field of header (x or y dimension) when x == 0 and y == 0, then at correct processor Requires ALU in switch ust recompute CRC (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE ource Based and Table Lookup Routing ource Based ource specifies output port for each switch in route Very imple witches no control state strip output port off header yrinet uses this Table Lookup Very mall Header, index into table for output port Big tables, must be kept up to date... (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 20
21 Deterministic v.s. Adaptive Routing Deterministic follows a prespecified route mesh: dimension-order routing» (x1, y1) -> (x2, y2)» first Dx = x2 - x1,» then Dy = y2 - y1, hypercube: edge-cube routing» X = x0x1x2...xn -> Y = y0y1y2...yn» R = X xor Y» Traverse dimensions of differing address in order tree: common ancestor Adaptive route determined by contention for output port (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Adaptive Routing Essential for fault tolerance at least multipath Can improve utilization of the network imple deterministic algorithms easily run into bad permutations Fully/partially adaptive, minimal/non-minimal Can introduce complexity or anomalies Little adaptation goes a long way! (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 21
22 Deadlock Break a Necessary Condition Use more than one resource Not willing to release resource in use Cycle in order of recourse use Guri ohi (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Deadlock Free Routing Virtual Channels Not virtual cut-through Add buffers so flits of wormhole packets can be interleaved Up*-Down* Number switches: higher = farther away from processors route up, make one turn, route down Turn odel Routing Restrict order of turns» West First» North Last» Negative First Can increase number of hops (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 22
23 inimal turn restrictions in 2D +y -x +x West-first north-last -y negative first (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Outline Topology witching, Routing, & Deadlock witch Design Flow Control Case tudies (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 23
24 A Generic witch At minimum, must route inputs to outputs Receiver Input Buffer Output Buffer Transmitter Input orts Cross-bar Output orts Control Routing, cheduling VLI makes it easier to create larger fully connected switches (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE witch Design Issues ports, pin limited data path (cross bar designs) non-blocking crossbar routing logic per input ALU table Finite tate achine for cut-through (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 24
25 witch Buffering need to absorb some transient peaks shared buffer pool need high bandwidth one output could hog all buffer space input buffers (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Input Buffering buffer per port routing logic head of line (HOL) blocking subsequent packet may be routed to unused output port (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 25
26 Output Buffering Buffers logically associated with output split on either side of cross bar Arbitration for physical link (output scheduling) static priority random round-robin oldest-first Effects of adaptive routing? elect output based on availability requires feedback from output port (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE tacked Dimension witch Uses only 2x2 switch to build higher dimension switch Host out z in 2x2 z out y in 2x2 y out x in 2x2 x out Host in (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 26
27 Outline Topology witching, Routing, & Deadlock witch Design Flow Control Case tudies (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Congestion Control acket switched networks do not reserve bandwidth; this leads to contention olution: prevent packets from entering until contention is reduced (e.g., metering lights) Options: End-to-end Flow Control Link-level Flow Control (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 27
28 Flow Control acket discarding: If a packet arrives at a switch and there is no room in the buffer, the packet is discarded no communication between switches, requires higher level protocol Flow control: between pairs of receivers and senders; use feedback to tell the sender when it is allowed to send the next packet (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Link-Level Flow Control Ready Data Transfer single flit when receiver is ready Could have long links with many flits in flight (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 28
29 Credit-based (Window) Flow Control Receiver gives N credits to sender sender decrements count stops sending if zero receiver sends back credit as it drains its buffer bundle credits to reduce overhead ust account for link latency (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Water Level high water, low water stop & go back to source switch (yrinet) can send redundant stop/go Incoming phits top Go Outgoing phits (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 29
30 Outline Topology witching, Routing, & Deadlock witch Design Flow Control Case tudies (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Case tudy Cray T3D 1024 switch nodes each connected to 2 processors 3D Torus, bidirectional, 300 B/s Link: 16 bits, 8 control bits Variable size packet (multiple of 16 bits) Logical request & response networks 2 virtual channels each for deadlock tacked dimension routing Wormhole for large packets, virtual cut-through for small packets (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 30
31 IB -2 (Vulcan) witch has eight bidirectional 40 B/s links Link: 8 data bits, 1 tag, 1 reverse flow-control Flit is 16 bits, phit is 8 input FIFO + output FIFO + central buffer byte segments (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE Real achines Wide links, smaller routing delay Tremendous variation (C) 2001 ark D. Hill from Adve,Falsafi, Lebeck, Reinhardt, & ingh C/ECE age 31
Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies. Admin
Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies Alvin R. Lebeck CPS 220 Admin Homework #5 Due Dec 3 Projects Final (yes it will be cumulative) CPS 220 2 1 Review: Terms Network characterized
More informationRouting Algorithm. How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus)
Routing Algorithm How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus) Many routing algorithms exist 1) Arithmetic 2) Source-based 3) Table lookup
More informationLecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996
Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996 RHK.S96 1 Review: ABCs of Networks Starting Point: Send bits between 2 computers Queue
More informationECE 669 Parallel Computer Architecture
ECE 669 Parallel Computer Architecture Lecture 21 Routing Outline Routing Switch Design Flow Control Case Studies Routing Routing algorithm determines which of the possible paths are used as routes how
More informationLecture 12: Interconnection Networks. Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E)
Lecture 12: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) 1 Topologies Internet topologies are not very regular they grew
More informationInterconnection Network
Interconnection Network Recap: Generic Parallel Architecture A generic modern multiprocessor Network Mem Communication assist (CA) $ P Node: processor(s), memory system, plus communication assist Network
More informationNOW Handout Page 1. Outline. Networks: Routing and Design. Routing. Routing Mechanism. Routing Mechanism (cont) Properties of Routing Algorithms
Outline Networks: Routing and Design Routing Switch Design Case Studies CS 5, Spring 99 David E. Culler Computer Science Division U.C. Berkeley 3/3/99 CS5 S99 Routing Recall: routing algorithm determines
More informationLecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University
18 447 Lecture 26: Interconnects James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L26 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today get an overview of parallel
More informationLecture: Interconnection Networks
Lecture: Interconnection Networks Topics: Router microarchitecture, topologies Final exam next Tuesday: same rules as the first midterm 1 Packets/Flits A message is broken into multiple packets (each packet
More informationRecall: The Routing problem: Local decisions. Recall: Multidimensional Meshes and Tori. Properties of Routing Algorithms
CS252 Graduate Computer Architecture Lecture 16 Multiprocessor Networks (con t) March 14 th, 212 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~kubitron/cs252
More information4. Networks. in parallel computers. Advances in Computer Architecture
4. Networks in parallel computers Advances in Computer Architecture System architectures for parallel computers Control organization Single Instruction stream Multiple Data stream (SIMD) All processors
More informationCMSC 611: Advanced. Interconnection Networks
CMSC 611: Advanced Computer Architecture Interconnection Networks Interconnection Networks Massively parallel processor networks (MPP) Thousands of nodes Short distance (
More informationInterconnection Networks
Lecture 17: Interconnection Networks Parallel Computer Architecture and Programming A comment on web site comments It is okay to make a comment on a slide/topic that has already been commented on. In fact
More informationCS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99 CS258 S99 2
Real Machines Interconnection Network Topology Design Trade-offs CS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99
More informationTDT Appendix E Interconnection Networks
TDT 4260 Appendix E Interconnection Networks Review Advantages of a snooping coherency protocol? Disadvantages of a snooping coherency protocol? Advantages of a directory coherency protocol? Disadvantages
More informationLecture 24: Interconnection Networks. Topics: topologies, routing, deadlocks, flow control
Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control 1 Topology Examples Grid Torus Hypercube Criteria Bus Ring 2Dtorus 6-cube Fully connected Performance Bisection
More informationInterconnection Networks: Topology. Prof. Natalie Enright Jerger
Interconnection Networks: Topology Prof. Natalie Enright Jerger Topology Overview Definition: determines arrangement of channels and nodes in network Analogous to road map Often first step in network design
More informationInterconnection topologies (cont.) [ ] In meshes and hypercubes, the average distance increases with the dth root of N.
Interconnection topologies (cont.) [ 10.4.4] In meshes and hypercubes, the average distance increases with the dth root of N. In a tree, the average distance grows only logarithmically. A simple tree structure,
More informationLecture 12: Interconnection Networks. Topics: dimension/arity, routing, deadlock, flow control
Lecture 12: Interconnection Networks Topics: dimension/arity, routing, deadlock, flow control 1 Interconnection Networks Recall: fully connected network, arrays/rings, meshes/tori, trees, butterflies,
More informationInterconnection networks
Interconnection networks When more than one processor needs to access a memory structure, interconnection networks are needed to route data from processors to memories (concurrent access to a shared memory
More informationRouting Algorithms. Review
Routing Algorithms Today s topics: Deterministic, Oblivious Adaptive, & Adaptive models Problems: efficiency livelock deadlock 1 CS6810 Review Network properties are a combination topology topology dependent
More informationPhysical Organization of Parallel Platforms. Alexandre David
Physical Organization of Parallel Platforms Alexandre David 1.2.05 1 Static vs. Dynamic Networks 13-02-2008 Alexandre David, MVP'08 2 Interconnection networks built using links and switches. How to connect:
More informationLecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance
Lecture 13: Interconnection Networks Topics: lots of background, recent innovations for power and performance 1 Interconnection Networks Recall: fully connected network, arrays/rings, meshes/tori, trees,
More informationCommunication Performance in Network-on-Chips
Communication Performance in Network-on-Chips Axel Jantsch Royal Institute of Technology, Stockholm November 24, 2004 Network on Chip Seminar, Linköping, November 25, 2004 Communication Performance In
More informationLecture 15: PCM, Networks. Today: PCM wrap-up, projects discussion, on-chip networks background
Lecture 15: PCM, Networks Today: PCM wrap-up, projects discussion, on-chip networks background 1 Hard Error Tolerance in PCM PCM cells will eventually fail; important to cause gradual capacity degradation
More informationModule 17: "Interconnection Networks" Lecture 37: "Introduction to Routers" Interconnection Networks. Fundamentals. Latency and bandwidth
Interconnection Networks Fundamentals Latency and bandwidth Router architecture Coherence protocol and routing [From Chapter 10 of Culler, Singh, Gupta] file:///e /parallel_com_arch/lecture37/37_1.htm[6/13/2012
More informationLecture 15: Networks & Interconnect Interface, Switches, Routing, Examples Professor David A. Patterson Computer Science 252 Fall 1996
Lecture 15: Networks & Interconnect Interface, Switches, Routing, Examples Professor David A. Patterson Computer Science 252 Fall 1996 DAP.F96 1 Review: Memory Hierarchies File Cache Hard Disk Tapes On-Line
More informationInterconnection Networks. Issues for Networks
Interconnection Networks Communications Among Processors Chris Nevison, Colgate University Issues for Networks Total Bandwidth amount of data which can be moved from somewhere to somewhere per unit time
More informationLecture 16: On-Chip Networks. Topics: Cache networks, NoC basics
Lecture 16: On-Chip Networks Topics: Cache networks, NoC basics 1 Traditional Networks Huh et al. ICS 05, Beckmann MICRO 04 Example designs for contiguous L2 cache regions 2 Explorations for Optimality
More informationJ. E. Smith. Introduction Software Scaling Hardware Scaling Case studies MIT J Machine Cray T3D Cray T3E* CM-5**
Lecture Outline assively arallel rocessors ECE/C 757 pring 2007 J. E. mith Copyright (C) 2007 by James E. mith (unless noted otherwise) All rights reserved. Except for use in ECE/C 757, no part of these
More informationLecture: Interconnection Networks. Topics: TM wrap-up, routing, deadlock, flow control, virtual channels
Lecture: Interconnection Networks Topics: TM wrap-up, routing, deadlock, flow control, virtual channels 1 TM wrap-up Eager versioning: create a log of old values Handling problematic situations with a
More informationCS252 Graduate Computer Architecture Lecture 14. Multiprocessor Networks March 9 th, 2011
CS252 Graduate Computer Architecture Lecture 14 Multiprocessor Networks March 9 th, 2011 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~kubitron/cs252
More informationLecture 2: Topology - I
ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 2: Topology - I Tushar Krishna Assistant Professor School of Electrical and
More informationNetwork Properties, Scalability and Requirements For Parallel Processing. Communication assist (CA)
Network Properties, Scalability and Requirements For Parallel Processing Scalable Parallel Performance: Continue to achieve good parallel performance "speedup"as the sizes of the system/problem are increased.
More informationCS521 CSE IITG 11/23/2012
A ahu 1 Topology : who is connected to whom? Direct / Indirect : where is switching done? tatic / Dynamic : when is switching done? Circuit switching / packet switching : how are connections estalished?
More informationInterconnection Network. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University
Interconnection Network Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Topics Taxonomy Metric Topologies Characteristics Cost Performance 2 Interconnection
More informationInterconnection Networks
Lecture 18: Interconnection Networks Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Credit: many of these slides were created by Michael Papamichael This lecture is partially
More informationNetwork Properties, Scalability and Requirements For Parallel Processing. Communication assist (CA)
Network Properties, Scalability and Requirements For Parallel Processing Scalable Parallel Performance: Continue to achieve good parallel performance "speedup"as the sizes of the system/problem are increased.
More informationCS575 Parallel Processing
CS575 Parallel Processing Lecture three: Interconnection Networks Wim Bohm, CSU Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 license.
More informationChapter 9 Multiprocessors
ECE200 Computer Organization Chapter 9 Multiprocessors David H. lbonesi and the University of Rochester Henk Corporaal, TU Eindhoven, Netherlands Jari Nurmi, Tampere University of Technology, Finland University
More informationLecture 3: Topology - II
ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 3: Topology - II Tushar Krishna Assistant Professor School of Electrical and
More informationBasic Low Level Concepts
Course Outline Basic Low Level Concepts Case Studies Operation through multiple switches: Topologies & Routing v Direct, indirect, regular, irregular Formal models and analysis for deadlock and livelock
More informationInterconnection Network
Interconnection Network Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE3054: Multicore Systems, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu) Topics
More informationTopologies. Maurizio Palesi. Maurizio Palesi 1
Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and
More informationINTERCONNECTION NETWORKS LECTURE 4
INTERCONNECTION NETWORKS LECTURE 4 DR. SAMMAN H. AMEEN 1 Topology Specifies way switches are wired Affects routing, reliability, throughput, latency, building ease Routing How does a message get from source
More informationDistributed Memory Machines. Distributed Memory Machines
Distributed Memory Machines Arvind Krishnamurthy Fall 2004 Distributed Memory Machines Intel Paragon, Cray T3E, IBM SP Each processor is connected to its own memory and cache: cannot directly access another
More informationMultiprocessor Interconnection Networks
Multiprocessor Interconnection Networks Todd C. Mowry CS 740 November 19, 1998 Topics Network design space Contention Active messages Networks Design Options: Topology Routing Direct vs. Indirect Physical
More informationNOW Handout Page 1. Recap: Gigaplane Bus Timing. Scalability
Recap: Gigaplane Bus Timing 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Address Rd A Rd B Scalability State Arbitration 1 4,5 2 Share ~Own 6 Own 7 A D A D A D A D A D A D A D A D CS 258, Spring 99 David E. Culler
More informationLecture 18: Communication Models and Architectures: Interconnection Networks
Design & Co-design of Embedded Systems Lecture 18: Communication Models and Architectures: Interconnection Networks Sharif University of Technology Computer Engineering g Dept. Winter-Spring 2008 Mehdi
More informationNetwork-on-chip (NOC) Topologies
Network-on-chip (NOC) Topologies 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and performance
More informationPacket Switch Architecture
Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics 7. Output Scheduling for QoS Guarantees 8.
More informationPacket Switch Architecture
Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics 7. Output Scheduling for QoS Guarantees 8.
More informationDr e v prasad Dt
Dr e v prasad Dt. 12.10.17 Contents Characteristics of Multiprocessors Interconnection Structures Inter Processor Arbitration Inter Processor communication and synchronization Cache Coherence Introduction
More informationEECS 570. Lecture 19 Interconnects: Flow Control. Winter 2018 Subhankar Pal
Lecture 19 Interconnects: Flow Control Winter 2018 Subhankar Pal http://www.eecs.umich.edu/courses/eecs570/ Slides developed in part by Profs. Adve, Falsafi, Hill, Lebeck, Martin, Narayanasamy, Nowatzyk,
More informationCSC630/CSC730: Parallel Computing
CSC630/CSC730: Parallel Computing Parallel Computing Platforms Chapter 2 (2.4.1 2.4.4) Dr. Joe Zhang PDC-4: Topology 1 Content Parallel computing platforms Logical organization (a programmer s view) Control
More informationInterconnect Technology and Computational Speed
Interconnect Technology and Computational Speed From Chapter 1 of B. Wilkinson et al., PARAL- LEL PROGRAMMING. Techniques and Applications Using Networked Workstations and Parallel Computers, augmented
More information[ ] In earlier lectures, we have seen that switches in an interconnection network connect inputs to outputs, usually with some kind buffering.
Switch Design [ 10.3.2] In earlier lectures, we have seen that switches in an interconnection network connect inputs to outputs, usually with some kind buffering. Here is a basic diagram of a switch. Receiver
More informationEE382 Processor Design. Illinois
EE382 Processor Design Winter 1998 Chapter 8 Lectures Multiprocessors Part II EE 382 Processor Design Winter 98/99 Michael Flynn 1 Illinois EE 382 Processor Design Winter 98/99 Michael Flynn 2 1 Write-invalidate
More informationInternet II. CS10 : Beauty and Joy of Computing. cs10.berkeley.edu. !!Senior Lecturer SOE Dan Garcia!!! Garcia UCB!
cs10.berkeley.edu CS10 : Beauty and Joy of Computing Internet II!!Senior Lecturer SOE Dan Garcia!!!www.cs.berkeley.edu/~ddgarcia CS10 L17 Internet II (1)! Why Networks?! Originally sharing I/O devices
More informationECE 4750 Computer Architecture, Fall 2017 T06 Fundamental Network Concepts
ECE 4750 Computer Architecture, Fall 2017 T06 Fundamental Network Concepts School of Electrical and Computer Engineering Cornell University revision: 2017-10-17-12-26 1 Network/Roadway Analogy 3 1.1. Running
More informationII. Principles of Computer Communications Network and Transport Layer
II. Principles of Computer Communications Network and Transport Layer A. Internet Protocol (IP) IPv4 Header An IP datagram consists of a header part and a text part. The header has a 20-byte fixed part
More informationEE/CSCI 451: Parallel and Distributed Computation
EE/CSCI 451: Parallel and Distributed Computation Lecture #5 1/29/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 From last class Outline
More informationL14-15 Scalable Interconnection Networks. Scalable, High Performance Network
L14-15 Scalable Interconnection Networks 1 Scalable, High Performance Network At Core of Parallel Computer Architecture Requirements and trade-offs at many levels Elegant mathematical structure Deep relationships
More informationNetworks. Distributed Systems. Philipp Kupferschmied. Universität Karlsruhe, System Architecture Group. May 6th, 2009
Networks Distributed Systems Philipp Kupferschmied Universität Karlsruhe, System Architecture Group May 6th, 2009 Philipp Kupferschmied Networks 1/ 41 1 Communication Basics Introduction Layered Communication
More informationInterconnection Networks
Lecture 15: Interconnection Networks Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2016 Credit: some slides created by Michael Papamichael, others based on slides from Onur Mutlu
More informationEE/CSCI 451: Parallel and Distributed Computation
EE/CSCI 451: Parallel and Distributed Computation Lecture #4 1/24/2018 Xuehai Qian xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Announcements PA #1
More informationParallel Computing Interconnection Networks
Parallel Computing Interconnection Networks Readings: Hager s book (4.5) Pacheco s book (chapter 2.3.3) http://pages.cs.wisc.edu/~tvrdik/5/html/section5.html#aaaaatre e-based topologies Slides credit:
More informationParallel Systems Prof. James L. Frankel Harvard University. Version of 6:50 PM 4-Dec-2018 Copyright 2018, 2017 James L. Frankel. All rights reserved.
Parallel Systems Prof. James L. Frankel Harvard University Version of 6:50 PM 4-Dec-2018 Copyright 2018, 2017 James L. Frankel. All rights reserved. Architectures SISD (Single Instruction, Single Data)
More informationLecture: Transactional Memory, Networks. Topics: TM implementations, on-chip networks
Lecture: Transactional Memory, Networks Topics: TM implementations, on-chip networks 1 Summary of TM Benefits As easy to program as coarse-grain locks Performance similar to fine-grain locks Avoids deadlock
More informationAdvanced Computer Networks Data Center Architecture. Patrick Stuedi, Qin Yin, Timothy Roscoe Spring Semester 2015
Advanced Computer Networks 263-3825-00 Data Center Architecture Patrick Stuedi, Qin Yin, Timothy Roscoe Spring Semester 2015 1 MORE ABOUT TOPOLOGIES 2 Bisection Bandwidth Bisection bandwidth: Sum of the
More informationEE 382C Interconnection Networks
EE 8C Interconnection Networks Deadlock and Livelock Stanford University - EE8C - Spring 6 Deadlock and Livelock: Terminology Deadlock: A condition in which an agent waits indefinitely trying to acquire
More informationTopology basics. Constraints and measures. Butterfly networks.
EE48: Advanced Computer Organization Lecture # Interconnection Networks Architecture and Design Stanford University Topology basics. Constraints and measures. Butterfly networks. Lecture #: Monday, 7 April
More informationInterprocessor Communication. Basics of Network Routing
Interprocessor Communication There are two main differences between sequential computers and parallel computers -- multiple processors and the hardware to connect them together. That hardware is the most
More informationCharacteristics of Mult l ip i ro r ce c ssors r
Characteristics of Multiprocessors A multiprocessor system is an interconnection of two or more CPUs with memory and input output equipment. The term processor in multiprocessor can mean either a central
More informationEE/CSCI 451: Parallel and Distributed Computation
EE/CSCI 451: Parallel and Distributed Computation Lecture #11 2/21/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Outline Midterm 1:
More informationLecture 14: Large Cache Design III. Topics: Replacement policies, associativity, cache networks, networking basics
Lecture 14: Large Cache Design III Topics: Replacement policies, associativity, cache networks, networking basics 1 LIN Qureshi et al., ISCA 06 Memory level parallelism (MLP): number of misses that simultaneously
More informationPortland State University ECE 588/688. Directory-Based Cache Coherence Protocols
Portland State University ECE 588/688 Directory-Based Cache Coherence Protocols Copyright by Alaa Alameldeen and Haitham Akkary 2018 Why Directory Protocols? Snooping-based protocols may not scale All
More informationLecture 2 Parallel Programming Platforms
Lecture 2 Parallel Programming Platforms Flynn s Taxonomy In 1966, Michael Flynn classified systems according to numbers of instruction streams and the number of data stream. Data stream Single Multiple
More informationEEC-484/584 Computer Networks
EEC-484/584 Computer Networks Lecture 13 wenbing@ieee.org (Lecture nodes are based on materials supplied by Dr. Louise Moser at UCSB and Prentice-Hall) Outline 2 Review of lecture 12 Routing Congestion
More informationHigh Performance Computing Programming Paradigms and Scalability Part 2: High-Performance Networks
High Performance Computing Programming Paradigms and Scalability Part 2: High-Performance Networks PD Dr. rer. nat. habil. Ralf-Peter Mundani Computation in Engineering (CiE) Scientific Computing (SCCS)
More informationInterconnection Networks: Flow Control. Prof. Natalie Enright Jerger
Interconnection Networks: Flow Control Prof. Natalie Enright Jerger Switching/Flow Control Overview Topology: determines connectivity of network Routing: determines paths through network Flow Control:
More informationParallel Computer Architecture II
Parallel Computer Architecture II Stefan Lang Interdisciplinary Center for Scientific Computing (IWR) University of Heidelberg INF 368, Room 532 D-692 Heidelberg phone: 622/54-8264 email: Stefan.Lang@iwr.uni-heidelberg.de
More information18-740/640 Computer Architecture Lecture 16: Interconnection Networks. Prof. Onur Mutlu Carnegie Mellon University Fall 2015, 11/4/2015
18-740/640 Computer Architecture Lecture 16: Interconnection Networks Prof. Onur Mutlu Carnegie Mellon University Fall 2015, 11/4/2015 Required Readings Required Reading Assignment: Dubois, Annavaram,
More informationParallel Architecture. Sathish Vadhiyar
Parallel Architecture Sathish Vadhiyar Motivations of Parallel Computing Faster execution times From days or months to hours or seconds E.g., climate modelling, bioinformatics Large amount of data dictate
More informationDeadlock and Router Micro-Architecture
1 EE482: Advanced Computer Organization Lecture #8 Interconnection Network Architecture and Design Stanford University 22 April 1999 Deadlock and Router Micro-Architecture Lecture #8: 22 April 1999 Lecturer:
More informationLecture 3: Flow-Control
High-Performance On-Chip Interconnects for Emerging SoCs http://tusharkrishna.ece.gatech.edu/teaching/nocs_acaces17/ ACACES Summer School 2017 Lecture 3: Flow-Control Tushar Krishna Assistant Professor
More informationCS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 9. Routing and Flow Control
CS 498 Hot Topics in High Performance Computing Networks and Fault Tolerance 9. Routing and Flow Control Intro What did we learn in the last lecture Topology metrics Including minimum diameter of directed
More informationDeadlock and Livelock. Maurizio Palesi
Deadlock and Livelock 1 Deadlock (When?) Deadlock can occur in an interconnection network, when a group of packets cannot make progress, because they are waiting on each other to release resource (buffers,
More informationGrowth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs.
Internetworking Multiple networks are a fact of life: Growth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs. Fault isolation,
More informationEECS 122: Introduction to Computer Networks Switch and Router Architectures. Today s Lecture
EECS : Introduction to Computer Networks Switch and Router Architectures Computer Science Division Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley,
More informationLearning Curve for Parallel Applications. 500 Fastest Computers
Learning Curve for arallel Applications ABER molecular dynamics simulation program Starting point was vector code for Cray-1 145 FLO on Cray90, 406 for final version on 128-processor aragon, 891 on 128-processor
More informationCSC 4900 Computer Networks: Network Layer
CSC 4900 Computer Networks: Network Layer Professor Henry Carter Fall 2017 Villanova University Department of Computing Sciences Review What is AIMD? When do we use it? What is the steady state profile
More informationLecture 16: Network Layer Overview, Internet Protocol
Lecture 16: Network Layer Overview, Internet Protocol COMP 332, Spring 2018 Victoria Manfredi Acknowledgements: materials adapted from Computer Networking: A Top Down Approach 7 th edition: 1996-2016,
More informationGeneric Architecture. EECS 122: Introduction to Computer Networks Switch and Router Architectures. Shared Memory (1 st Generation) Today s Lecture
Generic Architecture EECS : Introduction to Computer Networks Switch and Router Architectures Computer Science Division Department of Electrical Engineering and Computer Sciences University of California,
More informationCommunication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems.
Cluster Networks Introduction Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems. As usual, the driver is performance
More informationChapter 3 : Topology basics
1 Chapter 3 : Topology basics What is the network topology Nomenclature Traffic pattern Performance Packaging cost Case study: the SGI Origin 2000 2 Network topology (1) It corresponds to the static arrangement
More informationFault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies. Mohsin Y Ahmed Conlan Wesson
Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies Mohsin Y Ahmed Conlan Wesson Overview NoC: Future generation of many core processor on a single chip
More informationWormhole Routing Techniques for Directly Connected Multicomputer Systems
Wormhole Routing Techniques for Directly Connected Multicomputer Systems PRASANT MOHAPATRA Iowa State University, Department of Electrical and Computer Engineering, 201 Coover Hall, Iowa State University,
More informationFrom Routing to Traffic Engineering
1 From Routing to Traffic Engineering Robert Soulé Advanced Networking Fall 2016 2 In the beginning B Goal: pair-wise connectivity (get packets from A to B) Approach: configure static rules in routers
More informationJUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS
1 JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS Shabnam Badri THESIS WORK 2011 ELECTRONICS JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS
More information