Application-Aware Deadlock-Free Oblivious Routing
- Geoffrey Barrett
- 5 years ago
Transcription
1 Application-Aware Deadlock-Free Oblivious Routing: Michel Kinsy, Myong Hyon Cho, Tina Wen, Edward Suh (Cornell University), Marten van Dijk, and Srinivas Devadas. Massachusetts Institute of Technology
2 Outline: Routing Algorithms and Motivation; Application-Aware Oblivious Routing; Bandwidth-Sensitive Routing Approach; Router Architecture and Performance Analysis; Plans for the Future
3 Oblivious Routing: Routes are statically determined given the source and destination addresses. Example: XY routing. [Figure: mesh network with sources SA, SB, SC and destinations DA, DB, DC.] Link capacity: 50 MB/s. Each flow has a 25 MB/s bandwidth demand.
4 Oblivious Routing: Routes are statically determined given the source and destination addresses. Example: XY routing. (+) Simple and fast router designs. [Figure: mesh network with sources SA, SB, SC and destinations DA, DB, DC.] Link capacity: 50 MB/s. Each flow has a 25 MB/s bandwidth demand.
5 Oblivious Routing: Routes are statically determined given the source and destination addresses. Example: XY routing. (+) Simple and fast router designs. (-) Leads to network underutilization. (-) Lacks proper load balancing. [Figure: mesh network with sources SA, SB, SC and destinations DA, DB, DC.] Link capacity: 50 MB/s. Each flow has a 25 MB/s bandwidth demand.
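As a minimal sketch of why XY routing is oblivious, the Python function below derives a route from the source and destination coordinates alone, travelling fully along X before turning into Y. The (x, y) coordinate convention and the name `xy_route` are illustrative assumptions, not taken from the slides.

```python
def xy_route(src, dst):
    """Return the list of (x, y) nodes visited by dimension-ordered XY routing.

    The route depends only on src and dst, never on network load,
    which is what makes the scheme oblivious.
    """
    x, y = src
    path = [(x, y)]
    # Phase 1: correct the X coordinate.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Phase 2: correct the Y coordinate.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

Because every flow between the same pair of nodes takes the same path, flows that share a row or column can pile onto one link while parallel links stay idle, which is the underutilization the slide points out.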
6 Adaptive Routing: Routes are dynamically adjusted based on network status. [Figure: mesh network with sources SA, SB, SC and destinations DA, DB, DC.] Link capacity: 50 MB/s. Each flow has a 25 MB/s bandwidth demand.
7 Adaptive Routing: Routes are dynamically adjusted based on network status. (+) Better load balancing and path diversity. (+) Potentially better throughput and latency. [Figure: mesh network with sources SA, SB, SC and destinations DA, DB, DC.] Link capacity: 50 MB/s. Each flow has a 25 MB/s bandwidth demand.
8 Adaptive Routing: Routes are dynamically adjusted based on network status. (+) Better load balancing and path diversity. (+) Potentially better throughput and latency. (-) Needs global or local knowledge of network conditions. (-) Router complexity. [Figure: mesh network with sources SA, SB, SC and destinations DA, DB, DC.] Link capacity: 50 MB/s. Each flow has a 25 MB/s bandwidth demand.
9 Motivation: Can we get the best of both worlds? (+) Simple and fast router designs. (+) Better load balancing. (+) Potentially better throughput and latency.
10 Motivation: Given an application with known data communication patterns, can we determine a set of static routes that performs better than conventional oblivious routing? Exploit knowledge of bandwidth demands (or latency requirements). Ensure deadlock freedom.
11 Platforms and Suitable Applications: Suitable for applications with predictable communication patterns: video compression, processor simulation, rendering. Reconfigurable hardware: processing elements and their interconnection network can be configured.
13 Application-Aware Routing Framework, Step 1: Use the targeted network topology and resources (e.g., buffer space) to create a conventional channel dependency graph (CDG) D of the network. [Figure: the six-node mesh (F, E, D on the top row; A, B, C on the bottom row) is transformed into a CDG with vertices FE, ED, DC, CB, BE, EF, FA, AB, BC, CD, DE, EB, BA, AF.]
14 Application-Aware Routing Framework, Step 1: Use the targeted network topology and resources (e.g., buffer space) to create a conventional channel dependency graph (CDG) D of the network. Vertices in the CDG represent network links. [Figure: the six-node mesh (F, E, D on the top row; A, B, C on the bottom row) is transformed into a CDG with vertices FE, ED, DC, CB, BE, EF, FA, AB, BC, CD, DE, EB, BA, AF.]
15 Application-Aware Routing Framework, Step 2: Create a new acyclic CDG D_A by deleting some edges from D, because the channel dependency graph D derived from the network topology may contain many cycles. Well-known result: a cycle-free channel dependency graph ensures deadlock freedom. [Figure: CDG with vertices FE, ED, DC, CB, BE, EF, FA, AB, BC, CD, DE, EB, BA, AF.]
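Steps 1 and 2 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the link list is read off the example mesh in the slides, each link contributes two directed channels, a dependency edge joins channel XY to channel YZ whenever Z differs from X (U-turns are assumed not to be allowed), and the names `build_cdg` and `has_cycle` are hypothetical.

```python
# Undirected links of the example mesh (top row F E D, bottom row A B C).
LINKS = [("F", "E"), ("E", "D"), ("D", "C"), ("C", "B"),
         ("B", "E"), ("F", "A"), ("A", "B")]


def build_cdg(links):
    """Map each directed channel (e.g. 'FE') to the channels that may follow it."""
    channels = [u + v for u, v in links] + [v + u for u, v in links]
    cdg = {c: [] for c in channels}
    for a in channels:
        for b in channels:
            # b may follow a if it starts where a ends and is not a U-turn.
            if a[1] == b[0] and b[1] != a[0]:
                cdg[a].append(b)
    return cdg


def has_cycle(graph):
    """Depth-first search with three colours; a grey-to-grey edge closes a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}

    def visit(v):
        color[v] = GREY
        for w in graph[v]:
            if color[w] == GREY or (color[w] == WHITE and visit(w)):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and visit(v) for v in graph)


if __name__ == "__main__":
    cdg = build_cdg(LINKS)
    print(len(cdg), "channels;", sum(len(s) for s in cdg.values()), "dependency edges")
    print("contains a cycle:", has_cycle(cdg))
```

The unrestricted CDG of even this small mesh contains cycles (for example FE, EB, BA, AF and back to FE), which is why Step 2 deletes edges before any flow is routed.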
16 Turn Model (Glass and Ni, 1994): A systematic way of generating deadlock-free routes with a small number of prohibited turns. Routes are deadlock-free if they conform to at least ONE of the turn models (acyclic channel dependence graph). [Figure: West-First turn model; North-Last turn model.]
17 Acyclic CDG: Deadlock-Free Routes. Per the North-Last prohibited turns, all the edges in red are deleted. [Figure: the example mesh, its CDG, and the resulting North-Last acyclic CDG.]
18 Acyclic CDG: Deadlock-Free Routes. Turns can also be prohibited ad hoc; all the edges in red are deleted. [Figure: the example mesh, its CDG, and the resulting ad hoc acyclic CDG.]
19 Acyclic CDG: Deadlock-Free Routes. Turns can also be prohibited ad hoc; all the edges in red are deleted. 5 edges are deleted here vs. 4 edges in the North-Last CDG. [Figure: the example mesh, its CDG, and the resulting ad hoc acyclic CDG.]
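The North-Last case can be checked with a short Python sketch. It assumes that "north" points from the bottom row (A, B, C) to the top row (F, E, D), that the North-Last model prohibits the north-to-east and north-to-west turns, and that U-turns are excluded from the CDG; the coordinates and function names are illustrative. Under these assumptions the sketch deletes four edges, matching the count quoted on slide 19.

```python
# Assumed (x, y) positions of the example mesh nodes.
POS = {"F": (0, 1), "E": (1, 1), "D": (2, 1),
       "A": (0, 0), "B": (1, 0), "C": (2, 0)}
STEP_TO_DIR = {(1, 0): "E", (-1, 0): "W", (0, 1): "N", (0, -1): "S"}
NORTH_LAST_PROHIBITED = {("N", "E"), ("N", "W")}


def direction(channel):
    """Compass direction of a directed channel such as 'AF'."""
    (ux, uy), (vx, vy) = POS[channel[0]], POS[channel[1]]
    return STEP_TO_DIR[(vx - ux, vy - uy)]


def mesh_channels():
    """All directed channels between neighbouring mesh nodes."""
    return [u + v for u in POS for v in POS
            if abs(POS[u][0] - POS[v][0]) + abs(POS[u][1] - POS[v][1]) == 1]


def north_last_cdg():
    """Return (acyclic CDG, list of deleted dependency edges)."""
    chans = mesh_channels()
    kept, deleted = {c: [] for c in chans}, []
    for a in chans:
        for b in chans:
            if a[1] != b[0] or b[1] == a[0]:
                continue  # not consecutive, or a U-turn
            if (direction(a), direction(b)) in NORTH_LAST_PROHIBITED:
                deleted.append((a, b))
            else:
                kept[a].append(b)
    return kept, deleted


def is_acyclic(graph):
    """Kahn's algorithm: the graph is acyclic iff every vertex can be removed."""
    indeg = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indeg[w] += 1
    ready = [v for v in graph if indeg[v] == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for w in graph[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                ready.append(w)
    return removed == len(graph)


if __name__ == "__main__":
    kept, deleted = north_last_cdg()
    print("deleted edges:", sorted(deleted))
    print("acyclic:", is_acyclic(kept))
```

An ad hoc acyclic CDG would replace `NORTH_LAST_PROHIBITED` with a hand-picked set of deleted edges and rely on the same acyclicity check.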
20 Application-Aware Routing Framework, Step 3: Transform D_A into a flow network G_A, given a set of k flows denoted K. Flows K = {K_1, K_2, ..., K_k}. K_i = (s_i, t_i, d_i), where s_i and t_i are the source and sink for connection i, and d_i is the demand.
21 Application-Aware Routing Framework, Step 3: Transform D_A into a flow network G_A, given a set of k flows denoted K. Part of the modular decomposition of the H.264 decoder, with the following estimated bandwidths and placement. [Figure: H.264 decoder modules (video bitstream, Entropy Decoder Module, Inverse Transform/Quantization Module, Intra Prediction/Deblocking Reconstruction Module, other modules), with flows K_1 and K_2 at 39.7 MB/s each.]
Flow ID | Source | Destination | Demand
K_1 | F | B | 39.7 MB/s
K_2 | B | D | 39.7 MB/s
22 Application-Aware Routing Framework, Step 3: Transform D_A into a flow network G_A, given a set of k flows denoted K. Flows are routed on the CDG, not on the network. To route K_1 = (F, B, 39.7 MB/s) on the ad hoc acyclic CDG, dummy nodes s_F and d_B are created to drive flow K_1 from its source F and to sink it into B. [Figure: ad hoc acyclic CDG with dummy nodes s_F and d_B.]
23 Application-Aware Routing Framework, Step 3: Transform D_A into a flow network G_A, given a set of k flows denoted K. Flows are routed on the CDG, not on the network. To route K_2 = (B, D, 39.7 MB/s) on the ad hoc acyclic CDG, dummy nodes s_B and d_D are created to drive flow K_2 from its source B and to sink it into D. [Figure: ad hoc acyclic CDG with dummy nodes s_B and d_D.]
24 Application-Aware Routing Framework, Step 3: Transform D_A into a flow network G_A, given a set of k flows denoted K. Flows are routed on the CDG, not on the network. To route K_2 = (B, D, 39.7 MB/s) on the ad hoc acyclic CDG: edges into BE are assigned the capacity of link BE in the mesh. No capacity or weight is assigned to the edges incident on sink nodes. [Figure: ad hoc acyclic CDG with dummy nodes s_B and d_D.]
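The construction in Step 3 can be sketched in Python as below. It assumes an acyclic CDG stored as a dict from channel names (such as "FE") to successor channels, gives every edge into a channel that channel's link capacity, and leaves edges into the dummy sink uncapacitated, as the slides describe. The function name `to_flow_network` and the tiny CDG fragment in the usage example are hypothetical; the fragment is not the ad hoc CDG from the figures.

```python
def to_flow_network(cdg, link_capacity, src, dst):
    """Add dummy source/sink nodes for one flow and label edges with capacities.

    Returns (source_name, sink_name, edges), where edges maps
    (tail, head) to a capacity, or to None for edges into the sink.
    """
    source, sink = f"s_{src}", f"d_{dst}"
    edges = {}
    for channel, successors in cdg.items():
        for nxt in successors:
            # An edge into a CDG vertex inherits that link's capacity.
            edges[(channel, nxt)] = link_capacity[nxt]
        if channel[0] == src:
            # The dummy source drives every channel leaving the source node.
            edges[(source, channel)] = link_capacity[channel]
        if channel[1] == dst:
            # Channels arriving at the destination drain into the dummy sink.
            edges[(channel, sink)] = None
    return source, sink, edges


if __name__ == "__main__":
    # Hypothetical CDG fragment with two candidate routes from F to B.
    fragment = {"FE": ["EB"], "EB": [], "FA": ["AB"], "AB": []}
    capacity = {c: 50.0 for c in fragment}
    s, d, edges = to_flow_network(fragment, capacity, "F", "B")
    for edge, cap in sorted(edges.items()):
        print(edge, cap)
```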
25 Application-Aware Routing Framework, Step 4: Perform application-aware routing of the flows in G_A. Step 5: If desired, go to Step 2 and repeat using a different acyclic CDG. Step 6: Select the best set of routes found, per the routing function used in Step 4. In Step 4, bandwidth-sensitive routing can be used as a type of application-aware routing scheme.
27 Bandwidth-Sensitive Oblivious Routing (BSOR): Goal: route flows while minimizing the maximum channel load (MCL) U in the network: minimize $U = \max_{v} \sum_{i=1}^{k} \sum_{(u,v) \in E} f_i(u,v)$, where v is a link/vertex in the CDG (e.g., ED) and $f_i(u,v)$ is the bandwidth of edge (u,v) used by flow i (e.g., $f_1(BE, ED) = 0$, whereas $f_2(BE, ED) = 39.7$). [Figure: ad hoc acyclic CDG with the routed flows.]
28 Bandwidth-Sensitive Oblivious Routing (BSOR): Goal: route flows while minimizing the maximum channel load (MCL) U in the network: minimize $U = \max_{v} \sum_{i=1}^{k} \sum_{(u,v) \in E} f_i(u,v)$, where v is a link/vertex in the CDG (e.g., ED) and $f_i(u,v)$ is the bandwidth of edge (u,v) used by flow i (e.g., $f_1(BE, ED) = 0$, whereas $f_2(BE, ED) = 39.7$). U is the load on the most heavily loaded channel; that channel is the bottleneck of the entire network and determines the saturation throughput of the system. [Figure: ad hoc acyclic CDG with the routed flows.]
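To make the objective concrete, the following Python sketch computes the maximum channel load for a given set of routes by summing the demand of every flow that crosses each channel. The function name `max_channel_load` and the example routes are illustrative assumptions; only the 39.7 MB/s demands come from the slides.

```python
from collections import defaultdict


def max_channel_load(routes, demands):
    """Return (bottleneck_channel, load) for routes given as lists of channels."""
    load = defaultdict(float)
    for path, demand in zip(routes, demands):
        for channel in path:
            load[channel] += demand
    bottleneck = max(load, key=load.get)
    return bottleneck, load[bottleneck]


if __name__ == "__main__":
    demands = [39.7, 39.7]  # MB/s, flows K_1 and K_2
    disjoint = [["FA", "AB"], ["BE", "ED"]]
    shared = [["AB", "BC"], ["AB", "BE"]]  # hypothetical routes sharing channel AB
    print(max_channel_load(disjoint, demands))
    print(max_channel_load(shared, demands))
```

Routes that keep the two flows on separate channels leave the MCL at the demand of a single flow, while any shared channel doubles it.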
29 BSOR Algorithms: The unsplittable flow problem is NP-hard. Mixed Integer Linear Programming (MILP) can provide an optimal solution in worst-case exponential time; it works for small problems with ~100 flows. Dijkstra's weighted shortest-path algorithm provides a polynomial-time heuristic that produces good results.
30 Mixed Integer Linear Programming:
Capacity: $\forall v \neq s_i, t_i:\; h(v) = \sum_{i=1}^{k} \sum_{(u,v) \in E} f_i(u,v) \le c(u,v)$, where $c(u,v)$ is the capacity assigned to the edges into v.
Flow conservation: $\forall i,\ \forall u \neq s_i, t_i:\; \sum_{(w,u) \in E} f_i(w,u) = \sum_{(u,w) \in E} f_i(u,w)$
Unsplittable flow: $\forall i:\; \sum_{(s_i,w) \in E} f_i(s_i,w) = \sum_{(w,t_i) \in E} f_i(w,t_i) = g_i$; $\forall i,\ \forall (u,v) \in E:\; f_i(u,v) \le b_i(u,v)\, d_i$; $\forall i,\ \forall u:\; \sum_{(u,v) \in E} b_i(u,v) \le 1$
Hop count (helps control path lengths): $\forall i:\; \sum_{(u,v) \in E} b_i(u,v) \le hop_i$
32 Dijkstra's Weighted Shortest-Path BSOR: Polynomial time (suitable for large problems). Greedily routes one flow at a time. The weighting function: [equation not preserved in the transcript]. Residual capacity: [equation not preserved in the transcript].
33 Dijkstra-Based Flow Routing Illustration: For flow K_1 = (F, B, 39.7 MB/s), we consider all possible routes conforming to the acyclic graph. Here the ad hoc acyclic CDG is used. [Figure: the example mesh and the ad hoc acyclic CDG with candidate routes for K_1.]
34 Dijkstra-Based Flow Routing Illustration: For flow K_1 = (F, B, 39.7 MB/s), we consider all possible routes conforming to the acyclic graph. The final route corresponds to the best path in the CDG, as determined by the weighting function. [Figure: the example mesh and the ad hoc acyclic CDG with the selected route for K_1.]
35 Dijkstra-Based Flow Routing Illustration: For flow K_2 = (B, D, 39.7 MB/s): after the edge weights are adjusted following the routing of K_1, we consider all possible routes for K_2 conforming to the acyclic graph. [Figure: the example mesh and the ad hoc acyclic CDG with candidate routes for K_2.]
36 Dijkstra-Based Flow Routing Illustration: For flow K_2 = (B, D, 39.7 MB/s): the final route corresponds to the best path in the CDG, as determined by the weighting function. Here both routes permitted under the acyclic CDG have the same weight. [Figure: the example mesh and the ad hoc acyclic CDG with the selected route for K_2.]
37 Dijkstra-Based Flow Routing Illustration: Final routes for the two flows are as shown. The routing order of flows in Dijkstra-based algorithms does affect route selection. MILP produces the minimal MCL through exhaustive search. [Figure: the example mesh with the final routes.]
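The greedy procedure illustrated above can be sketched in Python. Because the slide's weighting and residual-capacity formulas did not survive transcription, this sketch substitutes an assumed weight, the reciprocal of a channel's residual capacity, and treats channels that cannot carry the flow's demand as unusable; it should be read as one plausible instance of the idea, not the authors' weighting function. The name `route_flows` and the four-channel CDG in the usage example are hypothetical.

```python
import heapq


def route_flows(cdg, capacity, flows):
    """Greedily route flows one at a time over an acyclic CDG.

    cdg: dict mapping a channel (u, v) to the channels allowed to follow it.
    capacity: dict mapping each channel to its link capacity.
    flows: list of (source_node, sink_node, demand), routed in the given order.
    """
    residual = dict(capacity)
    routes = []
    for src, dst, demand in flows:
        def weight(channel):
            # ASSUMED weighting: scarce residual capacity makes a channel costly.
            left = residual[channel]
            return 1.0 / left if left >= demand else float("inf")

        dist, prev, heap = {}, {}, []
        # A virtual source feeds every channel that leaves the source node.
        for channel in cdg:
            if channel[0] == src and weight(channel) != float("inf"):
                dist[channel], prev[channel] = weight(channel), None
                heapq.heappush(heap, (dist[channel], channel))
        last = None
        while heap:
            d, channel = heapq.heappop(heap)
            if d > dist[channel]:
                continue  # stale heap entry
            if channel[1] == dst:
                last = channel
                break
            for nxt in cdg[channel]:
                nd = d + weight(nxt)
                if nd < dist.get(nxt, float("inf")):
                    dist[nxt], prev[nxt] = nd, channel
                    heapq.heappush(heap, (nd, nxt))
        if last is None:
            raise ValueError(f"no feasible route from {src} to {dst}")
        path = []
        while last is not None:
            path.append(last)
            last = prev[last]
        path.reverse()
        for channel in path:
            residual[channel] -= demand  # later flows see the reduced capacity
        routes.append(path)
    return routes


if __name__ == "__main__":
    # Hypothetical CDG: two channel-disjoint routes from A to D.
    cdg = {("A", "B"): [("B", "D")], ("B", "D"): [],
           ("A", "C"): [("C", "D")], ("C", "D"): []}
    capacity = {c: 50.0 for c in cdg}
    print(route_flows(cdg, capacity, [("A", "D", 25.0), ("A", "D", 25.0)]))
```

Since each routed flow lowers the residual capacity seen by the next one, changing the order of `flows` can change the selected routes, which is the order dependence noted on slide 37.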
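38 Comparison of Maximum Channel Load: Traffic patterns: Transpose, Bit Complement, Shuffle, and the H.264 decoder (bandwidths and flows derived through profiling). [Table: MCL for the Transpose, Bit Complement, Shuffle, and H.264 traffic patterns under XY, YX, ROMM, Valiant, Dijkstra, and MILP routing; the values were not preserved in the transcript.]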
40 Baseline Architecture: Typical virtual-channel router. Router architecture: routing module, VC allocator, switch allocator, VC state, crossbar switch, output ports. Router routing phases: Routing Computation (RC), Virtual Channel Allocation (VA), Switch Allocation (SA), Switch Traversal (ST). [Figure: 3x3 mesh network of nine processing elements (PE), each attached to a router (R).]
41 Router for Application-Aware Routing: We need modifications to the standard router architecture for application-aware routing. The main change required is in the routing module, which needs table-based routing. [Figure: router architecture and routing phases, as on the previous slide.]
42 Two Ways of Table-Based Routing: Source routing eliminates the routing step, but results in longer packets. Node-table routing: each module contains a routing table, which is looked up at every hop, but this does not change the per-hop latency. [Figure: a packet traveling between nodes A, B, and C under source routing and under node-table routing, where each table row holds an output port (out) and an index.]
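The two table-based schemes can be contrasted with a small Python sketch. The data layout is an assumption made for illustration: under source routing the packet carries its full list of output ports, while under node-table routing the packet carries only a table index and each node's table row yields an output port plus the index to use at the next node. The port names, the `LOCAL` marker, and the function names are hypothetical.

```python
def source_route(start, ports, neighbors):
    """Follow the list of output ports carried in the packet header."""
    node, visited = start, [start]
    for port in ports:
        node = neighbors[node][port]
        visited.append(node)
    return visited


def table_route(start, index, tables, neighbors):
    """Look up (output port, next index) in each node's table until delivery."""
    node, visited = start, [start]
    while True:
        port, index = tables[node][index]
        if port == "LOCAL":  # assumed marker: deliver to the local processing element
            return visited
        node = neighbors[node][port]
        visited.append(node)


if __name__ == "__main__":
    # Hypothetical three-node path: A --N--> B --E--> C.
    neighbors = {"A": {"N": "B"}, "B": {"E": "C"}, "C": {}}
    tables = {"A": {0: ("N", 2)}, "B": {2: ("E", 1)}, "C": {1: ("LOCAL", 0)}}
    print(source_route("A", ["N", "E"], neighbors))
    print(table_route("A", 0, tables, neighbors))
```

The header in the source-routed case grows with the hop count, whereas the node-table packet carries a single index at the cost of storing a table in every router.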
43 Performance Analysis: Benchmarks: synthetic (Transpose, Bit Complement, and Shuffle) and application (H.264 decoder). Simulator: a cycle-accurate network simulator. 8x8 2-D mesh network with 1, 2, 4, or 8 VCs per port. Fixed packet length: 8 flits. Per-hop latency: 1 cycle. Flit buffer size per VC: 16 flits. Simulation for 100,000 cycles after 20,000 cycles of warm-up.
46 In H.264, head-of-line blocking is the limiting factor for BSOR. More VCs mitigate the effects. We also propose a different heuristic, BSORM (Bandwidth-Sensitive Oblivious Routing with Minimal Routes), which requires two virtual channels [NOCS 09].
47 The bandwidth of each individual flow is changed by 10% and 50% in a random fashion. ROMM and Valiant do not do as well as DOR algorithms. A complete summary of our experimental results is in the paper. [Figure: performance plots; axis labels and values were not preserved in the transcript.]
48 Conclusion: Our application-aware framework has the same router speed and complexity as other oblivious algorithms. It achieves better load balancing (and therefore better performance) than other oblivious algorithms. It can handle even substantial run-time bandwidth variation with no significant performance degradation.
49 Conclusion: Its limitation: it needs some knowledge of the application. To handle bursty flows, we have proposed bandwidth-adaptive networks that contain adaptive bidirectional links [PACT 09]. Ongoing work: How does bandwidth-sensitive routing do on the bandwidth-adaptive network?
50 Q & A Section: Thank you! More information at http://csg.csail.mit.edu. Also thanks to Keun Sup Shim and Mieszko Lis for their contributions, comments, and valuable feedback.
EC 513 Computer Architecture
EC 513 Computer Architecture On-chip Networking Prof. Michel A. Kinsy Virtual Channel Router VC 0 Routing Computation Virtual Channel Allocator Switch Allocator Input Ports VC x VC 0 VC x It s a system
More informationin Oblivious Routing
Static Virtual Channel Allocation in Oblivious Routing Keun Sup Shim, Myong Hyon Cho, Michel Kinsy, Tina Wen, Mieszko Lis G. Edward Suh (Cornell) Srinivas Devadas MIT Computer Science and Artificial Intelligence
More informationApplication-Aware Deadlock-Free Oblivious Routing
Application-Aware Deadlock-Free Oblivious Routing Michel Kinsy, Myong Hyon Cho, Tina Wen, Edward Suh, Marten van Dijk, Srinivas Devadas Computer Science and Artificial Intelligence Laboratory Department
More informationStatic Virtual Channel Allocation in Oblivious Routing
Static Virtual Channel Allocation in Oblivious Routing Keun Sup Shim Myong Hyon Cho Michel Kinsy Tina Wen Mieszko Lis G. Edward Suh Srinivas Devadas Computer Science and Artificial Intelligence Laboratory
More informationECE 1749H: Interconnec1on Networks for Parallel Computer Architectures: Rou1ng. Prof. Natalie Enright Jerger
ECE 1749H: Interconnec1on Networks for Parallel Computer Architectures: Rou1ng Prof. Natalie Enright Jerger Rou1ng Overview Discussion of topologies assumed ideal rou1ng In prac1ce Rou1ng algorithms are
More informationECE 1749H: Interconnec1on Networks for Parallel Computer Architectures: Rou1ng. Prof. Natalie Enright Jerger
ECE 1749H: Interconnec1on Networks for Parallel Computer Architectures: Rou1ng Prof. Natalie Enright Jerger Announcements Feedback on your project proposals This week Scheduled extended 1 week Next week:
More informationLink State Rou.ng Reading: Sec.ons 4.2 and 4.3.4
Link State Rou.ng Reading: Sec.ons. and.. COS 6: Computer Networks Spring 009 (MW :0 :50 in COS 05) Michael Freedman Teaching Assistants: WyaN Lloyd and Jeff Terrace hnp://www.cs.princeton.edu/courses/archive/spring09/cos6/
More informationPath-Diverse Inorder Routing
Path-Diverse Inorder Routing Mieszko Lis, Myong Hyon Cho, Keun Sup Shim, and Srinivas Devadas Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology {mieszko, mhcho,
More informationSEDA An architecture for Well Condi6oned, scalable Internet Services
SEDA An architecture for Well Condi6oned, scalable Internet Services Ma= Welsh, David Culler, and Eric Brewer University of California, Berkeley Symposium on Operating Systems Principles (SOSP), October
More informationOblivious Routing in On-Chip Bandwidth-Adaptive Networks Myong Hyon Cho, Mieszko Lis, Keun Sup Shim, Michel Kinsy, Tina Wen, and Srinivas Devadas
Computer Science and Artificial Intelligence Laboratory Technical Report MIT-CSAIL-TR-009-0 March 7, 009 Oblivious Routing in On-Chip Bandwidth-Adaptive Networks Myong Hyon Cho, Mieszko Lis, Keun Sup Shim,
More informationLink State Rou.ng Reading: Sec.ons 4.2 and 4.3.4
Link State Rou.ng Reading: Sec.ons. and.. COS 6: Computer Networks Spring 0 Mike Freedman hep://www.cs.princeton.edu/courses/archive/spring/cos6/ Inside a router Goals of Today s Lecture Control plane:
More informationCIS-331 Fall 2013 Exam 1 Name: Total of 120 Points Version 1
Version 1 1. (24 Points) Show the routing tables for routers A, B, C, and D. Make sure you account for traffic to the Internet. NOTE: Router E should only be used for Internet traffic. Router A Router
More informationBasic Switch Organization
NOC Routing 1 Basic Switch Organization 2 Basic Switch Organization Link Controller Used for coordinating the flow of messages across the physical link of two adjacent switches 3 Basic Switch Organization
More informationWireless Mul*hop Ad Hoc Networks
Wireless Mul*hop Guevara Noubir noubir@ccs.neu.edu Some slides are from Nitin Vaidya s tutorial. Infrastructure vs. Ad Hoc Wireless Networks Infrastructure networks: One or several Access- Points (AP)
More informationLecture 22: Router Design
Lecture 22: Router Design Papers: Power-Driven Design of Router Microarchitectures in On-Chip Networks, MICRO 03, Princeton A Gracefully Degrading and Energy-Efficient Modular Router Architecture for On-Chip
More informationInterconnection Networks: Routing. Prof. Natalie Enright Jerger
Interconnection Networks: Routing Prof. Natalie Enright Jerger Routing Overview Discussion of topologies assumed ideal routing In practice Routing algorithms are not ideal Goal: distribute traffic evenly
More informationOblivious Routing in On-Chip Bandwidth-Adaptive Networks
009 8th International Conference on Parallel Architectures and Compilation Techniques Oblivious Routing in On-Chip Bandwidth-Adaptive Networks Myong Hyon Cho, Mieszko Lis, Keun Sup Shim, Michel Kinsy,
More informationA Thermal-aware Application specific Routing Algorithm for Network-on-chip Design
A Thermal-aware Application specific Routing Algorithm for Network-on-chip Design Zhi-Liang Qian and Chi-Ying Tsui VLSI Research Laboratory Department of Electronic and Computer Engineering The Hong Kong
More informationCIS-331 Spring 2016 Exam 1 Name: Total of 109 Points Version 1
Version 1 Instructions Write your name on the exam paper. Write your name and version number on the top of the yellow paper. Answer Question 1 on the exam paper. Answer Questions 2-4 on the yellow paper.
More informationThomas Moscibroda Microsoft Research. Onur Mutlu CMU
Thomas Moscibroda Microsoft Research Onur Mutlu CMU CPU+L1 CPU+L1 CPU+L1 CPU+L1 Multi-core Chip Cache -Bank Cache -Bank Cache -Bank Cache -Bank CPU+L1 CPU+L1 CPU+L1 CPU+L1 Accelerator, etc Cache -Bank
More informationRandomized Partially-Minimal Routing: Near-Optimal Oblivious Routing for 3-D Mesh Networks
2080 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 11, NOVEMBER 2012 Randomized Partially-Minimal Routing: Near-Optimal Oblivious Routing for 3-D Mesh Networks Rohit Sunkam
More informationPhastlane: A Rapid Transit Optical Routing Network
Phastlane: A Rapid Transit Optical Routing Network Mark Cianchetti, Joseph Kerekes, and David Albonesi Computer Systems Laboratory Cornell University The Interconnect Bottleneck Future processors: tens
More informationInterconnection Networks: Topology. Prof. Natalie Enright Jerger
Interconnection Networks: Topology Prof. Natalie Enright Jerger Topology Overview Definition: determines arrangement of channels and nodes in network Analogous to road map Often first step in network design
More informationBandwidth Aware Routing Algorithms for Networks-on-Chip
1 Bandwidth Aware Routing Algorithms for Networks-on-Chip G. Longo a, S. Signorino a, M. Palesi a,, R. Holsmark b, S. Kumar b, and V. Catania a a Department of Computer Science and Telecommunications Engineering
More informationCIS-331 Exam 2 Fall 2015 Total of 105 Points Version 1
Version 1 1. (20 Points) Given the class A network address 117.0.0.0 will be divided into multiple subnets. a. (5 Points) How many bits will be necessary to address 4,000 subnets? b. (5 Points) What is
More informationRouteBricks: Exploi2ng Parallelism to Scale So9ware Routers
RouteBricks: Exploi2ng Parallelism to Scale So9ware Routers Mihai Dobrescu and etc. SOSP 2009 Presented by Shuyi Chen Mo2va2on Router design Performance Extensibility They are compe2ng goals Hardware approach
More informationLecture 23: Router Design
Lecture 23: Router Design Papers: A Gracefully Degrading and Energy-Efficient Modular Router Architecture for On-Chip Networks, ISCA 06, Penn-State ViChaR: A Dynamic Virtual Channel Regulator for Network-on-Chip
More informationSynchronized Progress in Interconnection Networks (SPIN) : A new theory for deadlock freedom
ISCA 2018 Session 8B: Interconnection Networks Synchronized Progress in Interconnection Networks (SPIN) : A new theory for deadlock freedom Aniruddh Ramrakhyani Georgia Tech (aniruddh@gatech.edu) Tushar
More informationCIS-331 Fall 2014 Exam 1 Name: Total of 109 Points Version 1
Version 1 1. (24 Points) Show the routing tables for routers A, B, C, and D. Make sure you account for traffic to the Internet. Router A Router B Router C Router D Network Next Hop Next Hop Next Hop Next
More informationHORNET User Manual. Copyright 2011 MIT
HORNET User Manual Copyright 2011 MIT LIST OF FIGURES CONTENTS Contents 1 Introduction 1 2 Getting Started 1 2.1 Before Installation............................................ 1 2.2 HORNET Design Overview......................................
More informationGlobal Adaptive Routing Algorithm Without Additional Congestion Propagation Network
1 Global Adaptive Routing Algorithm Without Additional Congestion ropagation Network Shaoli Liu, Yunji Chen, Tianshi Chen, Ling Li, Chao Lu Institute of Computing Technology, Chinese Academy of Sciences
More informationPrivacy-Preserving Shortest Path Computa6on
Privacy-Preserving Shortest Path Computa6on David J. Wu, Joe Zimmerman, Jérémy Planul, and John C. Mitchell Stanford University Naviga6on desired des@na@on current posi@on Naviga6on: A Solved Problem?
More informationJoint consideration of performance, reliability and fault tolerance in regular Networks-on-Chip via multiple spatially-independent interface terminals
Joint consideration of performance, reliability and fault tolerance in regular Networks-on-Chip via multiple spatially-independent interface terminals Philipp Gorski, Tim Wegner, Dirk Timmermann University
More informationPDA-HyPAR: Path-Diversity-Aware Hybrid Planar Adaptive Routing Algorithm for 3D NoCs
PDA-HyPAR: Path-Diversity-Aware Hybrid Planar Adaptive Routing Algorithm for 3D NoCs Jindun Dai *1,2, Renjie Li 2, Xin Jiang 3, Takahiro Watanabe 2 1 Department of Electrical Engineering, Shanghai Jiao
More informationA Layer-Multiplexed 3D On-Chip Network Architecture Rohit Sunkam Ramanujam and Bill Lin
50 IEEE EMBEDDED SYSTEMS LETTERS, VOL. 1, NO. 2, AUGUST 2009 A Layer-Multiplexed 3D On-Chip Network Architecture Rohit Sunkam Ramanujam and Bill Lin Abstract Programmable many-core processors are poised
More informationOpenSMART: Single-cycle Multi-hop NoC Generator in BSV and Chisel
OpenSMART: Single-cycle Multi-hop NoC Generator in BSV and Chisel Hyoukjun Kwon and Tushar Krishna Georgia Institute of Technology Synergy Lab (http://synergy.ece.gatech.edu) hyoukjun@gatech.edu April
More informationA Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing
727 A Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing 1 Bharati B. Sayankar, 2 Pankaj Agrawal 1 Electronics Department, Rashtrasant Tukdoji Maharaj Nagpur University, G.H. Raisoni
More informationDeadlock-free XY-YX router for on-chip interconnection network
LETTER IEICE Electronics Express, Vol.10, No.20, 1 5 Deadlock-free XY-YX router for on-chip interconnection network Yeong Seob Jeong and Seung Eun Lee a) Dept of Electronic Engineering Seoul National Univ
More informationEECS 570. Lecture 19 Interconnects: Flow Control. Winter 2018 Subhankar Pal
Lecture 19 Interconnects: Flow Control Winter 2018 Subhankar Pal http://www.eecs.umich.edu/courses/eecs570/ Slides developed in part by Profs. Adve, Falsafi, Hill, Lebeck, Martin, Narayanasamy, Nowatzyk,
More informationParallel Hardware and Interconnects
Parallel Hardware and Interconnects Sec1on 2.3 Bryan Mills, PhD Spring 2017 Summary Pthreads Simple threading (ie strassen) Shared Data and Cri1cal Sec1ons Busy- Wait Locks (mutex) Semaphore Read/Write
More informationPrediction Router: Yet another low-latency on-chip router architecture
Prediction Router: Yet another low-latency on-chip router architecture Hiroki Matsutani Michihiro Koibuchi Hideharu Amano Tsutomu Yoshinaga (Keio Univ., Japan) (NII, Japan) (Keio Univ., Japan) (UEC, Japan)
More informationDesign of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on. on-chip Architecture
Design of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on on-chip Architecture Avinash Kodi, Ashwini Sarathy * and Ahmed Louri * Department of Electrical Engineering and
More informationOrigin- des*na*on Flow Measurement in High- Speed Networks
IEEE INFOCOM, 2012 Origin- des*na*on Flow Measurement in High- Speed Networks Tao Li Shigang Chen Yan Qiao Introduc*on (Defini*ons) Origin- des+na+on flow between two routers is the set of packets that
More informationWhere we are in the Course
Where we are in the ourse More fun in the Network Layer! We ve covered packet forwarding Now we ll learn about roung Applicaon Transport Network Link Physical SE 61 University of Washington 1 Roung versus
More informationDynamic Packet Fragmentation for Increased Virtual Channel Utilization in On-Chip Routers
Dynamic Packet Fragmentation for Increased Virtual Channel Utilization in On-Chip Routers Young Hoon Kang, Taek-Jun Kwon, and Jeff Draper {youngkan, tjkwon, draper}@isi.edu University of Southern California
More informationArchitecture and Design of Efficient 3D Network-on-Chip for Custom Multi-Core SoC
BWCCA 2010 Fukuoka, Japan November 4-6 2010 Architecture and Design of Efficient 3D Network-on-Chip for Custom Multi-Core SoC Akram Ben Ahmed, Abderazek Ben Abdallah, Kenichi Kuroda The University of Aizu
More informationNOW Handout Page 1. Outline. Networks: Routing and Design. Routing. Routing Mechanism. Routing Mechanism (cont) Properties of Routing Algorithms
Outline Networks: Routing and Design Routing Switch Design Case Studies CS 5, Spring 99 David E. Culler Computer Science Division U.C. Berkeley 3/3/99 CS5 S99 Routing Recall: routing algorithm determines
More informationNetwork layer overview
Network layer overview understand principles behind layer services: layer service models forwarding versus rou:ng how a router works rou:ng (path selec:on) broadcast, mul:cast instan:a:on, implementa:on
More informationHypergraph Sparsifica/on and Its Applica/on to Par//oning
Hypergraph Sparsifica/on and Its Applica/on to Par//oning Mehmet Deveci 1,3, Kamer Kaya 1, Ümit V. Çatalyürek 1,2 1 Dept. of Biomedical Informa/cs, The Ohio State University 2 Dept. of Electrical & Computer
More informationLecture 25: Interconnection Networks, Disks. Topics: flow control, router microarchitecture, RAID
Lecture 25: Interconnection Networks, Disks Topics: flow control, router microarchitecture, RAID 1 Virtual Channel Flow Control Each switch has multiple virtual channels per phys. channel Each virtual
More informationSTG-NoC: A Tool for Generating Energy Optimized Custom Built NoC Topology
STG-NoC: A Tool for Generating Energy Optimized Custom Built NoC Topology Surbhi Jain Naveen Choudhary Dharm Singh ABSTRACT Network on Chip (NoC) has emerged as a viable solution to the complex communication
More informationLink Layer. w/ much credit to Cisco CCNA and Rick Graziani (Cabrillo)
Link Layer w/ much credit to Cisco CCNA and Rick Graziani (Cabrillo) Administra>via How are the labs going? Telnet- ing into Linux as root In /etc/pam.d/remote comment out line auth required pam_securely.so
More informationCIS-331 Exam 2 Spring 2016 Total of 110 Points Version 1
Version 1 1. (20 Points) Given the class A network address 121.0.0.0 will be divided into multiple subnets. a. (5 Points) How many bits will be necessary to address 8,100 subnets? b. (5 Points) What is
More informationSpanning Tree and Datacenters
Spanning Tree and Datacenters EE 122, Fall 2013 Sylvia Ratnasamy http://inst.eecs.berkeley.edu/~ee122/ Material thanks to Mike Freedman, Scott Shenker, Ion Stoica, Jennifer Rexford, and many other colleagues
More informationERA: An Efficient Routing Algorithm for Power, Throughput and Latency in Network-on-Chips
: An Efficient Routing Algorithm for Power, Throughput and Latency in Network-on-Chips Varsha Sharma, Rekha Agarwal Manoj S. Gaur, Vijay Laxmi, and Vineetha V. Computer Engineering Department, Malaviya
More informationKey Nego(a(on Protocol & Trust Router
Key Nego(a(on Protocol & Trust Router dra6- howle:- radsec- knp ABFAB, IETF 80 31 March, Prague. Introduc(on The ABFAB architecture does not require any par(cular AAA strategy for connec(ng RPs to IdPs.
More informationCellular Networks and Mobile Compu5ng COMS , Spring 2012
Cellular Networks and Mobile Compu5ng COMS 6998-8, Spring 2012 Instructor: Li Erran Li (lierranli@cs.columbia.edu) hkp://www.cs.columbia.edu/~coms6998-8/ 3/26/2012: Cellular Network and Traffic Characteriza5on