Interconnection Networks

1 Interconnection Networks

2 Interconnection Networks: Introduction
- How to connect individual devices together into a group of communicating devices?
- Device:
  - Component within a computer
  - Single computer
  - System of computers
- Types of elements:
  - End nodes (device + interface)
  - Links
  - Interconnection network
- Internetworking: interconnection of multiple networks
[Figure: four end nodes, each a device with a SW interface and a HW interface, attached by links to the interconnection network]

3 Interconnection Networks Introduction Interconnection networks should be designed to transfer the maximum amount of information within the least amount of time (and cost, power constraints) so as not to bottleneck the system Slide 3

4 Types of Interconnection Networks
- Four different domains
  - Depending on number and proximity of connected devices
- On-chip networks (OCNs or NoCs)
  - Devices are microarchitectural elements (functional units, register files), caches, directories, processors
  - Latest systems: dozens, hundreds of devices
    - Ex: Intel TeraFLOPS research prototype, 80 cores
    - Xeon Phi, 60 cores
  - Proximity: millimeters

5 System/Storage Area Networks (SANs)
- Multiprocessor and multicomputer systems
  - Interprocessor and processor-memory interconnections
- Server and data center environments
  - Storage and I/O components
- Hundreds to thousands of devices interconnected
  - IBM Blue Gene/L supercomputer (64K nodes, each with 2 processors)
- Maximum interconnect distance
  - Tens of meters (typical)
  - A few hundred meters (some)
    - InfiniBand: 120 Gbps over a distance of 300 m
- Examples (standards and proprietary)
  - InfiniBand, Myrinet, Quadrics, Advanced Switching Interconnect

6 Local Area Networks (LANs)
- Interconnect autonomous computer systems
- Machine room or throughout a building or campus
- Hundreds of devices interconnected (1,000s with bridging)
- Maximum interconnect distance
  - Few kilometers
  - Few tens of kilometers (some)
- Example (most popular): Ethernet, with 10 Gbps over 40 km

7 Wide Area Networks (WANs)
- Interconnect systems distributed across the globe
- Internetworking support is required
- Many millions of devices interconnected
- Maximum interconnect distance
  - Many thousands of kilometers
- Example: ATM (asynchronous transfer mode)

8 Interconnection Network Domains
[Figure: distance in meters (starting near 5 x 10^-3 for OCNs) plotted against number of devices interconnected (up to >100,000), showing the regions occupied by OCNs, SANs, LANs, and WANs]

9 Focus: On-Chip Networks Slide 9

10 On-Chip Networks (OCNs or NoCs)
- Why on-chip network?
  - Ad hoc wiring does not scale beyond a small number of cores
    - Prohibitive area
    - Long latency
- OCN offers
  - Scalability
  - Efficient multiplexing of communication
  - Often modular in nature (eases verification)

11 Differences between on-chip and off-chip networks
- Significant research in multi-chassis interconnection networks (off-chip)
  - Supercomputers
  - Clusters of workstations
  - Internet routers
- Leverage research and insight but
  - Constraints are different

12 Off-chip vs. on-chip
- Off-chip: I/O bottlenecks
  - Pin-limited bandwidth
  - Inherent overheads of off-chip I/O transmission
- On-chip
  - Wiring constraints
    - Metal layer limitations
    - Horizontal and vertical layout
    - Short, fixed length
    - Repeater insertion limits routing of wires
      - Avoid routing over dense logic
      - Impact wiring density
  - Power
    - Consume 10-15% or more of die power budget
  - Latency
    - Different order of magnitude
    - Routers consume significant fraction of latency

13 On-Chip Network Evolution
- Ad hoc wiring
  - Small number of nodes
- Buses and crossbars
  - Simplest variant of on-chip networks
  - Low core counts
  - Like traditional multiprocessors
    - Bus traffic quickly saturates with a modest number of cores
  - Crossbars: higher bandwidth
    - Poor area and power scaling

14 Multicore Examples (1): Crossbar (Sun Niagara)
- Niagara 2: 8x9 crossbar (area ~= core)
- Rock: hierarchical crossbar (5x5 crossbar connecting clusters of 4 cores)

15 Multicore Examples (2): Ring (IBM Cell)
- IBM Cell Element Interconnect Bus
  - 12 elements
  - 4 unidirectional rings
    - 16 bytes wide
    - Operates at 1.6 GHz

16 Many-Core Example: 2D Mesh
- Intel TeraFLOPS
  - 80-core prototype
  - 5 GHz
  - Each tile:
    - Processing engine + on-chip network router

17 Many-Core Example (2): Intel SCC
- Intel's Single-chip Cloud Computer (SCC) uses a 2D mesh with state-of-the-art routers

18 Performance and Cost
- Performance: latency and throughput
- Cost: area and power
[Figure: latency (sec) versus offered traffic (bits/sec), marking the zero-load latency and the saturation throughput]

19 Topics to be covered
- Interfaces
- Topology
- Routing
- Flow Control
- Router Microarchitecture

20 System Interfaces Slide 20

21 Systems and Interfaces
- Look at how systems interact and interface with network
- Types of multiprocessors
  - Shared memory
    - From high-end servers to embedded products
  - Message passing
    - Multiprocessor System on Chip (MPSoC)
      - Mobile consumer market
    - Clusters
- We focus on on-chip networks for shared-memory multi-core

22 Shared Memory CMP Architecture
[Figure: a tile containing a core, L1 I/D cache, L2 cache (tags, data, controller logic), and a router]
- L2: private or distributed shared cache
- Centralized shared cache will have a different organization
  - A tile could be a core or L2 bank

23 Impact of Coherence Protocol on Network Performance
- Coherence protocol shapes communication needed by system
- Single writer, multiple reader invariant
- Requires:
  - Data requests
  - Data responses
  - Coherence permissions

24 Broadcast vs. Directory
[Figure: two read-miss sequences. Broadcast: (1) read cache miss, (2) request broadcast, (3) send data. Directory: (1) read cache miss, (2) directory receives request, (3) send data. A memory controller is also shown.]

25 Coherence Protocol Requirements
- Different message types
  - Unicast, multicast, broadcast
- Directory protocol
  - Majority of requests: unicast
    - Lower bandwidth demands on network
  - More scalable due to point-to-point communication
- Broadcast protocol
  - Majority of requests: broadcast
    - Higher bandwidth demands
  - Often rely on network ordering

26 Protocol-Level Deadlock
[Figure: network end node attached to the interconnection network, with a request queue and a reply queue between the network and the memory/cache controller]
- Request-reply dependency
  - Network becomes flooded with requests that cannot be consumed until the network interface has generated a reply
- Deadlock dependency between multiple message classes
- Virtual channels can prevent protocol-level deadlock (to be discussed later)

27 Home Node/Memory Controller Issues
- Heterogeneity in network
  - Some tiles are memory controllers
    - Co-located with processor/cache or separate tile
    - Share injection/ejection bandwidth?
- Home node
  - Directory coherence information
  - <= number of tiles
- Potential hot spots in network?

28 Network Interface Slide 28

29 Network Interface: Miss Status Handling Registers
[Figure: network interface between core/cache and network. The cache sends requests (type, addr, data) to a cache protocol finite state machine and receives replies (type, addr, data). The state machine uses MSHRs (status, addr, data). "Message format and send" emits RdReq (dest, addr) and Writeback (dest, addr, data) to the network; "message receive" accepts RdReply (addr, data) and WriteAck (addr) from the network.]

30 Transaction Status Handling Registers (for centralized directory)
[Figure: network interface at the directory. "Message receive" accepts RdReq (src, addr) and Writeback (src, addr, data) from the network; "message format and send" emits RdReply (dest, addr, data) and WriteAck (dest, addr) to the network. The directory cache uses TSHRs (status, src, addr, data) and connects through the memory controller to off-chip memory.]

31 MPSoCs Slide 31

32 Synthesized NoCs for MPSoCs
- System-on-Chip (SoC)
  - Chips tailored to specific applications or domains
  - Designed quickly through composition of IP blocks
- Fundamental NoC concepts applicable to both CMP and MPSoC
- Key characteristics
  - Applications known a priori
  - Automated design process
  - Standardized interfaces
  - Area/power constraints tighter

33 Application Characterization
- Describe application with task graphs
- Annotate with traffic volumes
[Figure: task graph of a video decoder with nodes VLD, run length decode, inverse scan, AC/DC prediction, iquant, idct, up samp, VOP reconstruction, padding, VOP memory, stripe memory, and ARM; edges are annotated with traffic volumes]

34 Design Requirements
- Less aggressive
  - CMPs: GHz clock frequencies
  - MPSoCs: MHz clock frequencies
  - Pipelining may not be necessary
  - Standardized interfaces add significant delay
- Area and power
  - CMPs: 100 W for server
  - MPSoCs: several watts only
- Time to market
  - Automatic composition and generation

35 NoC Synthesis
[Figure: NoC synthesis flow. Labels: application, input traffic model, comm graph, constraint graph, user objectives (power, hop delay), constraints (area, power, hop delay, wire length), NoC component library, IP core models, NoC area models, NoC power models, topology synthesis (SunFloor; includes floorplanner and NoC router), system specs, floorplanning specifications, platform generation (xpipes-Compiler), SystemC code, RTL architectural simulation, codesign simulation, FPGA emulation, synthesis, placement and routing, area and power characterization, to fab]

36 NoC Synthesis
- Tool chain
  - Requires accurate power and area models
  - Quickly iterate through many designs
  - Library of soft macros for all NoC building blocks
  - Floorplanner
    - Determine router locations
    - Determine link lengths (delay)

37 NoC Network Interface Standards
- Standardized protocols
  - Plug and play with different IP blocks
- Bus-based semantics
  - Widely used
- Out-of-order transactions
  - Relax strict bus ordering semantics
  - Migrating MPSoCs from buses to NoCs

38 Summary
- Architecture
  - Impacts communication requirements
  - Coherence protocol: broadcast vs. directory
  - Shared vs. private caches
- CMP vs. MPSoC
  - General vs. application specific
  - Custom interfaces vs. standardized interfaces

39 Topics to be covered
- Interfaces
- Topology
- Routing
- Flow Control
- Router Microarchitecture

40 Types of Topologies Slide 40

41 Types of Topologies
- Focus on switched topologies
  - Alternatives: bus and crossbar
  - Bus
    - Connects a set of components to a single shared channel
    - Effective broadcast medium
  - Crossbar
    - Directly connects n inputs to m outputs without intermediate stages
    - Fully connected, single-hop network
    - Component of routers

42 Types of Topologies
- Direct
  - Each router is associated with a terminal node
  - All routers are sources and destinations of traffic
- Indirect
  - Routers are distinct from terminal nodes
  - Terminal nodes can source/sink traffic
  - Intermediate nodes switch traffic between terminal nodes
- Most on-chip networks use direct topologies

43 Torus (1)
- k-ary n-cube: k^n network nodes
- n-dimensional grid with k nodes in each dimension
[Figure: example networks labeled 3-ary 2-mesh, 2-cube, and 2,3,4-ary 3-mesh]

44 Torus (2)
- 1D or 2D torus maps well to planar substrate for on-chip
- Topologies in torus family
  - Ex: ring is a k-ary 1-cube
- Edge symmetric
  - Good for load balancing
  - Removing wrap-around links for mesh loses edge symmetry
    - More traffic concentrated on center channels
- Good path diversity
- Exploit locality for near-neighbor traffic

45 Torus (3)
- Degree = 2n, 2 channels per dimension
  - All nodes have same degree
- Total channels = 2nN
  - N is total number of nodes
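The torus formulas above can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `torus_metrics` is hypothetical, channels are counted the way the slide counts them (2nN), and k > 2 is assumed so that the two wrap-around neighbors in each dimension are distinct.

```python
def torus_metrics(k: int, n: int) -> dict:
    """Node count, node degree, and channel count of a k-ary n-cube (torus).

    Assumes k > 2; the channel count follows the slide's formula 2nN.
    """
    nodes = k ** n            # N = k^n
    degree = 2 * n            # two channels per dimension at every node
    channels = 2 * n * nodes  # total channels = 2nN
    return {"nodes": nodes, "degree": degree, "channels": channels}


# A 4-ary 2-cube (4x4 2D torus).
print(torus_metrics(4, 2))
```

For the 4x4 case this gives 16 nodes of degree 4, which matches the picture of a 2D torus where every router has north, south, east, and west neighbors.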

46 Mesh
- A torus with end-around connection removed
- Same node degree
- Higher demand for central channels
  - Load imbalance

47 Butterfly
- Indirect network
- k-ary n-fly: k^n network nodes
- Routing from 000 to 010
  - Destination address used to directly route packet
  - Bit n used to select output port at stage n
[Figure: 2-ary 3-fly with 2-input switches and 3 stages]
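Destination-tag routing in a butterfly can be sketched as follows. The function name `destination_tag_ports` is hypothetical, and consuming the most significant digit at the first stage is an assumption; the slide only states that bit n selects the output port at stage n.

```python
def destination_tag_ports(dest: int, k: int, n: int) -> list[int]:
    """Output port taken at each of the n stages of a k-ary n-fly.

    The destination address is read as n base-k digits. Assumption:
    the most significant digit is used at the first stage.
    """
    ports = []
    for stage in range(n):
        digit = (dest // k ** (n - 1 - stage)) % k
        ports.append(digit)
    return ports


# Slide example: route to destination 010 in a 2-ary 3-fly.
print(destination_tag_ports(0b010, k=2, n=3))
```

The source address never appears in the computation, which is one way to see why the butterfly has no path diversity: every packet for a given destination uses the same port sequence.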

48 Butterfly (2)
- No path diversity: |R_xy| = 1
  - Can add extra stages for diversity
    - Increase network diameter

49 Butterfly (3)
- Hop count
  - log_k N + 1
  - Does not exploit locality
    - Hop count same regardless of location
- Switch degree = 2k
- Requires long wires to implement

50 Clos Network
- 3-stage network where all input/output nodes are connected to all middle routers
- Key attribute: path diversity
  - Input node can select any middle router
  - Can enable non-blocking routing algorithms
[Figure: (5,3,4) Clos network]

51 Fat Tree
- Bandwidth remains constant at each level
- Regular tree: bandwidth decreases closer to root

52 Fat Tree (2)
- Provides path diversity

53 Irregular Topologies Slide 53

54 Irregular Topologies
- MPSoC design leverages wide variety of IP blocks
  - Regular topologies may not be appropriate given heterogeneity
  - Customized topology
    - Often more power efficient and delivers better performance
- Customize based on traffic characterization

55 Irregular Topology Example
[Figure: two custom topologies connecting the video decoder blocks (VLD, run length decoder, inverse scan, idct, iquant, AC/DC predict, up samp, ARM core, VOP reconstr, VOP memory, stripe memory, padding) through routers (R)]

56 Topology Customization
- Merging
  - Start with large number of switches
  - Merge adjacent routers to reduce area and power
- Splitting
  - Large crossbar connecting all nodes
  - Iteratively split into multiple small switches
    - Accommodate design constraints

57 Topology Implementation

58 Implementation
- Folding
  - Equalize path lengths
    - Reduces max link length
    - Increases length of other links

59 Concentration
- Don't need 1:1 ratio of routers to cores
  - Ex: 4 cores concentrated to 1 router
- Can save area and power
- Increases network complexity
  - Concentrator must implement policy for sharing injection bandwidth
  - During bursty communication
    - Can bottleneck

60 Implication of Abstract Metrics on Implementation
- Degree: useful proxy for router complexity
  - Increasing ports requires additional buffer queues, requestors to allocators, ports to crossbar
  - All contribute to critical path delay, area, and power
  - Link complexity does not correlate with degree
    - Link complexity depends on link width
    - With a fixed number of wires, link complexity for 2-port vs. 3-port is the same

61 Implications (2)
- Hop count: useful proxy for overall latency and power
  - Does not always correlate with latency
    - Depends heavily on router pipeline and link propagation
  - Example: hop count says A is better than B
    - Network A with 2 hops, 5-stage pipeline, 4-cycle link traversal vs.
    - Network B with 3 hops, 1-stage pipeline, 1-cycle link traversal
    - But A has 18-cycle latency vs. 6-cycle latency for B
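The two latency figures in this example follow from a simple per-hop model. The sketch below assumes each hop costs one full router pipeline plus one link traversal and ignores contention and serialization; `path_latency` is a hypothetical helper name, not something from the slides.

```python
def path_latency(hops: int, pipeline_stages: int, link_cycles: int) -> int:
    """Zero-load latency in cycles, assuming every hop pays one router
    pipeline traversal plus one link traversal (no contention)."""
    return hops * (pipeline_stages + link_cycles)


latency_a = path_latency(hops=2, pipeline_stages=5, link_cycles=4)
latency_b = path_latency(hops=3, pipeline_stages=1, link_cycles=1)
print(latency_a, latency_b)  # 18 6
```

Network A wins on hop count (2 vs. 3) but loses on latency by a factor of three, because each of its hops costs 9 cycles against 2 for network B.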

62 Topology Summary
- First network design decision
- Critical impact on network latency and throughput
  - Hop count provides first-order approximation of message latency
  - Bottleneck channels determine saturation throughput

63 Routing Slide 63

64 Routing Overview
- Discussion of topologies assumed ideal routing
- In practice
  - Routing algorithms are not ideal
- Goal: distribute traffic evenly among paths
  - Avoid hot spots, contention
  - More balanced → throughput is closer to ideal
- Keep complexity in mind

65 Routing Basics
- Once topology is fixed
- Routing algorithm determines path(s) from source to destination

66 Routing Algorithm Attributes
- Types
  - Deterministic, oblivious, adaptive
- Number of destinations
  - Unicast, multicast, broadcast?
- Adaptivity
  - Oblivious or adaptive? Local or global knowledge?
  - Minimal or non-minimal?
- Implementation
  - Source or node routing?
  - Table or circuit?

67 Routing Deadlock
[Figure: four packets A, B, C, and D forming a cycle]
- Each packet is occupying a link and waiting for a link
- Without routing restrictions, a resource cycle can occur
  - Leads to deadlock

68 Types of Routing Algorithms

69 Deterministic
- All messages from source to destination traverse the same path
- Common example: dimension order routing (DOR)
  - Message traverses network dimension by dimension
  - Aka XY routing
- Cons:
  - Eliminates any path diversity provided by topology
  - Poor load balancing
- Pros:
  - Simple and inexpensive to implement
  - Deadlock-free

70 Dimension Order Routing
- A.k.a. X-Y routing
  - Traverse network dimension by dimension
  - Can only turn to Y dimension after finishing X
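Dimension order routing on a 2D mesh fits in a few lines. This is a sketch under stated assumptions: nodes are (x, y) tuples, E means +x and N means +y (the convention the source-table slide appears to use), and `xy_route` is a hypothetical name.

```python
def xy_route(src: tuple[int, int], dst: tuple[int, int]) -> list[str]:
    """Hop-by-hop directions from src to dst on a 2D mesh, X dimension first.

    Assumed convention: E = +x, W = -x, N = +y, S = -y.
    """
    (sx, sy), (dx, dy) = src, dst
    # Finish every hop in X before taking any hop in Y.
    x_hops = ["E" if dx > sx else "W"] * abs(dx - sx)
    y_hops = ["N" if dy > sy else "S"] * abs(dy - sy)
    return x_hops + y_hops


print(xy_route((0, 0), (2, 3)))  # ['E', 'E', 'N', 'N', 'N']
```

Because the route depends only on source and destination, every packet between a given pair takes the same path, which is why the scheme is simple but offers no load balancing.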

71 Oblivious
- Routing decisions are made without regard to network state
  - Keeps algorithms simple
  - Unable to adapt
- Deterministic algorithms are a subset of oblivious

72 Valiant's Routing Algorithm
- To route from s to d
  - Randomly choose intermediate node d'
  - Route from s to d' and from d' to d
- Randomizes any traffic pattern
  - All patterns appear uniform random
  - Balances network load
- Non-minimal
- Destroys locality
[Figure: route from s through intermediate node d' to d]
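The two-phase structure of Valiant's algorithm can be sketched as below. Assumptions not taken from the slides: a k x k mesh, X-Y routing for each phase, the E = +x / N = +y convention, and the helper names `xy_route`, `valiant_route`, and `walk`.

```python
import random


def xy_route(src, dst):
    """X-then-Y directions from src to dst (assumed: E = +x, N = +y)."""
    (sx, sy), (dx, dy) = src, dst
    x_hops = ["E" if dx > sx else "W"] * abs(dx - sx)
    y_hops = ["N" if dy > sy else "S"] * abs(dy - sy)
    return x_hops + y_hops


def valiant_route(src, dst, k, rng=random):
    """Pick a uniformly random intermediate node in a k x k mesh, then
    route src -> intermediate -> dst using X-Y routing in both phases."""
    mid = (rng.randrange(k), rng.randrange(k))
    return mid, xy_route(src, mid) + xy_route(mid, dst)


def walk(src, route):
    """Follow a list of directions and return the node reached."""
    step = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}
    x, y = src
    for hop in route:
        x, y = x + step[hop][0], y + step[hop][1]
    return (x, y)


mid, route = valiant_route((0, 0), (2, 2), k=4, rng=random.Random(7))
print(mid, route, walk((0, 0), route))
```

The route always ends at the destination, but its length is at least the minimal hop count and often more, which is the non-minimal, locality-destroying behavior the slide describes.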

73 Minimal Oblivious
- Valiant's: load balancing but significant increase in hop count
- Minimal oblivious: some load balancing, but use shortest paths
  - d' must lie within min quadrant
  - 6 options for d'
  - Only 3 different paths
[Figure: minimal quadrant between s and d]

74 Oblivious Routing
- Valiant's and Minimal Adaptive
  - Deadlock free
    - When used in conjunction with X-Y routing
- Randomly choose between X-Y and Y-X routes
  - Oblivious but not deadlock free!

75 Adaptive
- Exploits path diversity
- Uses network state to make routing decisions
  - Buffer occupancies often used
  - Coupled with flow control mechanism
- Local information readily available
  - Global information more costly to obtain
  - Network state can change rapidly
  - Use of local information can lead to non-optimal choices
- Can be minimal or non-minimal

76 Minimal Adaptive Routing
- Local info can result in sub-optimal choices
[Figure: minimal adaptive route from s to d]

77 Non-minimal adaptive
- Fully adaptive
- Not restricted to take shortest path
- Misrouting: directing packet along non-productive channel
  - Priority given to productive output
  - Some algorithms forbid U-turns
- Livelock potential: traversing network without ever reaching destination
  - Mechanism to guarantee forward progress
    - Limit number of misroutings

78 Non-minimal routing example
- Longer path with potentially lower latency
- Livelock: continue routing in cycle
[Figure: two example routes from s to d]

79 Adaptive Routing Example
- Should 3 route clockwise or counterclockwise to 7?
  - 5 is using all the capacity of link 5 → 6
- Queue at node 5 will sense contention but not at node 3
- Backpressure: allows nodes to indirectly sense congestion
  - Queue in one node fills up, it will stop receiving flits
  - Previous queue will fill up
- If each queue holds 4 packets
  - 3 will send 8 packets before sensing congestion

80 Adaptive Routing: Turn Model
- DOR eliminates 4 turns
  - N to E, N to W, S to E, S to W
  - No adaptivity
- Some adaptivity by removing 2 of 8 turns
  - Remains deadlock free (like DOR)
- West first
  - Eliminates S to W and N to W

81 Turn Model Routing
- Negative first
  - Eliminates E to S and N to W
- North last
  - Eliminates N to E and N to W
- Odd-even
  - Eliminates 2 turns depending on whether current node is in an odd or even column
    - Even column: E to N and N to W
    - Odd column: E to S and S to W
  - Deadlock free (disallow 180-degree turns)
  - Better adaptivity

82 Negative-First Routing Example
- Limited or no adaptivity for certain source-destination pairs
[Figure: mesh with nodes (0,0), (2,0), (0,3), and (2,3) marked]

83 Turn Model Routing Deadlock
- What about eliminating turns NW and WN?
- Not a valid turn elimination
  - Resource cycle results

84 Adaptive Routing and Deadlock
- Option 1: eliminate turns that lead to deadlock
  - Limits flexibility
- Option 2: allow all turns
  - Gives more flexibility
  - Must use other mechanism to prevent deadlock
  - Rely on flow control (later)
    - Escape virtual channels

85 Routing Algorithm Implementation Slide 85

86 Routing Implementation
- Source tables
  - Entire route specified at source
  - Avoids per-hop routing latency
  - Unable to adapt dynamically to network conditions
  - Can specify multiple routes per destination
    - Gives fault tolerance and load balance
  - Supports reconfiguration (not specific to topology)

87 Source Table Routing
Source table at node (0,0):

Destination  Route 1  Route 2
00           X        X
10           EX       EX
20           EEX      EEX
01           NX       NX
11           NEX      ENX
21           NEEX     ENEX
02           NNX      NNX
12           ENNX     NNEX
22           EENNX    NNEEX
03           NNNX     NNNX
13           NENNX    ENNNX
23           EENNNX   NNNEEX

- Arbitrary length paths: storage overhead and packet overhead
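To see how a packet consumes such a route, the sketch below walks a route string one symbol per hop. It assumes that X means "eject at this node", that E is +x and N is +y, and that destinations are written as xy digit pairs; `follow_source_route` and `STEP` are hypothetical names introduced for illustration.

```python
# Unit step for each direction letter (assumed: E = +x, N = +y).
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}


def follow_source_route(src, route):
    """Apply a source route such as 'NENNX' hop by hop.

    Assumption: 'X' ejects the packet at the current node.
    """
    x, y = src
    for hop in route:
        if hop == "X":
            return (x, y)
        dx, dy = STEP[hop]
        x, y = x + dx, y + dy
    raise ValueError("route ended without an eject symbol")


# Both table entries for destination 13 reach node (1, 3) from (0, 0).
print(follow_source_route((0, 0), "NENNX"))
print(follow_source_route((0, 0), "ENNNX"))
```

Each hop strips one symbol from the header, so header length grows with path length, which is the packet overhead the slide mentions.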

88 Node Tables
- Store only next direction at each node
- Smaller tables than source routing
- Adds per-hop routing latency
- Can adapt to network conditions
  - Specify multiple possible outputs per destination
  - Select randomly to improve load balancing

89 Node Table Routing
- Implements west-first routing
- Each node would have 1 row of table
  - Max two possible output ports

From \ To  00   01   02   10   11   12   20   21   22
00         X -  N -  N -  E -  E N  E N  E -  E N  E N
01         S -  X -  N -  E S  E -  E N  E S  E -  E N
02         S -  S -  X -  E S  E S  E -  E S  E S  E -
10         W -  W -  W -  X -  N -  N -  E -  E N  E N
11         W -  W -  W -  S -  X -  N -  E S  E -  E N
12         W -  W -  W -  S -  S -  X -  E S  E S  E -
20         W -  W -  W -  W -  W -  W -  X -  N -  N -
21         W -  W -  W -  W -  W -  W -  S -  X -  N -
22         W -  W -  W -  W -  W -  W -  S -  S -  X -
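The entries of a west-first node table can be generated from a short rule. The sketch below is an illustration under assumptions: nodes are (x, y) tuples with E = +x and N = +y, X stands for ejection at the destination, and `west_first_outputs` is a hypothetical name.

```python
def west_first_outputs(cur, dst):
    """Permitted output ports at node cur for a packet headed to dst
    under minimal west-first routing (assumed: E = +x, N = +y)."""
    (cx, cy), (dx, dy) = cur, dst
    if (cx, cy) == (dx, dy):
        return ["X"]  # arrived: eject
    if dx < cx:
        return ["W"]  # all westward hops must be taken first
    outputs = []
    if dx > cx:
        outputs.append("E")
    if dy > cy:
        outputs.append("N")
    elif dy < cy:
        outputs.append("S")
    return outputs


# Row of the table held at node 00 in a 3x3 mesh (destinations 00..22).
row = {f"{x}{y}": west_first_outputs((0, 0), (x, y))
       for x in range(3) for y in range(3)}
print(row)
```

No cell ever lists more than two ports, and any destination to the west gets exactly one, which is where west-first gives up adaptivity in exchange for deadlock freedom.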

90 Implementation
- Combinational circuits can be used
  - Simple (e.g., DOR): low router overhead
  - Specific to one topology and one routing algorithm
    - Limits fault tolerance
- Tables can be updated to reflect new configuration, network faults, etc.

91 Circuit Based
[Figure: routing circuit. sx, x, sy, and y feed "=0" comparators that produce a productive direction vector (exit, +x, -x, +y, -y); route selection combines it with queue lengths to produce the selected direction vector (exit, +x, -x, +y, -y)]
- Next hop based on buffer occupancies
- Or could implement simple DOR
- Fixed w.r.t. topology

92 Routing Algorithms: Implementation

Routing algorithm      Source routing  Combinational  Node table
Deterministic: DOR     Yes             Yes            Yes
Oblivious: Valiant's   Yes             Yes            Yes
Oblivious: Minimal     Yes             Yes            Yes
Adaptive               No              Yes            Yes

93 Routing: Irregular Topologies
- MPSoCs
  - Power and performance benefits from irregular/custom topologies
- Common routing implementations
  - Rely on source or node table routing
- Maintain deadlock freedom
  - Turn model may not be feasible
    - Limited connectivity

94 Routing Summary
- Latency paramount concern
  - Minimal routing most common for NoC
  - Non-minimal can avoid congestion and deliver low latency
- To date: NoC research favors DOR for simplicity and deadlock freedom
  - On-chip networks often lightly loaded
- Only covered unicast routing
  - Recent work on extending on-chip routing to support multicast


More information

Chapter 9 Multiprocessors

Chapter 9 Multiprocessors ECE200 Computer Organization Chapter 9 Multiprocessors David H. lbonesi and the University of Rochester Henk Corporaal, TU Eindhoven, Netherlands Jari Nurmi, Tampere University of Technology, Finland University

More information

Interconnection Network

Interconnection Network Interconnection Network Recap: Generic Parallel Architecture A generic modern multiprocessor Network Mem Communication assist (CA) $ P Node: processor(s), memory system, plus communication assist Network

More information

Topologies. Maurizio Palesi. Maurizio Palesi 1

Topologies. Maurizio Palesi. Maurizio Palesi 1 Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and

More information

Communication Performance in Network-on-Chips

Communication Performance in Network-on-Chips Communication Performance in Network-on-Chips Axel Jantsch Royal Institute of Technology, Stockholm November 24, 2004 Network on Chip Seminar, Linköping, November 25, 2004 Communication Performance In

More information
