TCEP: Traffic Consolidation for Energy-Proportional High-Radix Networks
- Esmond Powers
- 5 years ago
Transcription
1 TCEP: Traffic Consolidation for Energy-Proportional High-Radix Networks
Gwangsun Kim (Arm Research); Hayoung Choi, John Kim (KAIST)
2 High-radix Networks
- A large number of narrow links: low network diameter, high path diversity
- Energy-proportionality can be challenging
  - Links use high-speed signaling
  - High energy consumption regardless of load (idle packets are transmitted)
- [Figures: Dragonfly network in the Cray XC30 system (image source: Cray); 1D flattened butterfly (fully connected); 2D flattened butterfly (fully connected within each dimension)]
3-4 Motivation
- Data center networks can be significantly underutilized
  - Resources are provisioned to meet peak demand
  - Low link utilization measured by Facebook
- Network energy waste can be high at low system utilization
- Exploit the link power-gating opportunity in high-radix routers
- Link power-gating challenges:
  - How to maximize power reduction?
  - How to keep the network connected?
  - How to achieve scalability?
  - How to minimize performance impact?
  - How to load-balance the network?
- References: [Roy et al., SIGCOMM '15] [Abts et al., ISCA '10]
5 Contents
- Background / Motivation
- Traffic consolidation
- Maintaining connectivity
- Criteria for selecting links to power-gate
- Power-aware load-balanced routing
- Evaluation
- Conclusion
6 Traffic Consolidation
- Energy-proportionality requires aggressive power-gating
- Consolidate flows onto fewer links through non-minimal routing
- [Figure: Flows 0, 1 and 2 between routers, with link utilizations of 25% and 50%, shown before and after traffic consolidation]
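As a toy illustration of the consolidation idea on this slide, the sketch below packs flow utilizations onto as few links as possible with a greedy first-fit-decreasing heuristic. This is not the TCEP algorithm; the function name `consolidate`, the `max_util` threshold of 80%, and the flow values are assumptions chosen only to make the example concrete.

```python
def consolidate(flow_utils, max_util=0.8):
    """Greedy first-fit-decreasing packing of flow utilizations onto links.

    flow_utils: link utilization demanded by each flow (1.0 = a full link).
    max_util:   hypothetical ceiling on the utilization of a single link.
    Returns a list of links, each a list of the flow utilizations it carries.
    """
    links = []
    for util in sorted(flow_utils, reverse=True):
        for link in links:
            # Reuse an already-ON link if the flow still fits under the ceiling.
            if sum(link) + util <= max_util:
                link.append(util)
                break
        else:
            # No ON link has room, so one more link has to stay powered.
            links.append([util])
    return links


# Three flows that would otherwise each occupy their own link.
flows = [0.50, 0.25, 0.50]
packed = consolidate(flows)
print(len(flows), "links before,", len(packed), "links after:", packed)
```

Every link that ends up empty after packing is a candidate for power-gating; the price is that some flows now take a non-minimal route.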
7-10 Subnetwork-based Distributed Approach
- Subnetwork: routers that are fully connected in a dimension
- Each subnetwork independently manages power with local information
- 2D flattened butterfly (fully connected within each dimension): row subnetworks and column subnetworks
- 1D flattened butterfly: consists of a single subnetwork
- [Figures: 2D flattened butterfly with its row and column subnetworks highlighted; 1D flattened butterfly]
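The subnetwork definition can be made concrete with a short sketch that enumerates the row and column subnetworks of a k x k 2D flattened butterfly. The router numbering (`y * k + x`) and the function name `subnetworks_2d` are assumptions for illustration; the slides do not specify a numbering scheme.

```python
def subnetworks_2d(k):
    """Return (rows, cols) for a k x k 2D flattened butterfly.

    Routers are numbered y * k + x (an assumed convention). Each row and
    each column is one subnetwork: its routers are fully connected.
    """
    rows = [[y * k + x for x in range(k)] for y in range(k)]
    cols = [[y * k + x for y in range(k)] for x in range(k)]
    return rows, cols


rows, cols = subnetworks_2d(4)
print("row subnetworks:   ", rows)
print("column subnetworks:", cols)
```

Every router belongs to exactly one row subnetwork and one column subnetwork, which is what lets each subnetwork manage its links using only local information.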
11 Maintaining Connectivity
- Constantly checking connectivity can incur high overhead
- Root network: a subset of links that are always ON to keep all nodes connected
- Star topology: minimal number of links and low network diameter (max. hop count = 2)
- [Figure: root network for a 1D flattened butterfly, distinguishing root network links from other links; the root network can be rearranged]
12-14 Root Network for Higher Dimensions
- A star topology is formed within each subnetwork
- Further reducing ON links? Too complex, with little added benefit (4.3% for radix-64 routers)
- Other links can be power-gated without affecting connectivity
- [Figure: root network for a 2D flattened butterfly, distinguishing root network links from other links]
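A minimal sketch of the star-shaped root network, assuming a single fully connected subnetwork and an arbitrarily chosen hub router. The helper names `star_root_network` and `max_hop_count` are hypothetical; the check simply confirms the two properties stated on the slides: the star needs only n - 1 links and keeps the maximum hop count at 2.

```python
from collections import deque


def star_root_network(routers, hub):
    """Always-ON links of a star centred on `hub` within one subnetwork."""
    return {frozenset((hub, r)) for r in routers if r != hub}


def max_hop_count(routers, links):
    """Largest shortest-path length over ON links, or None if disconnected."""
    adj = {r: set() for r in routers}
    for link in links:
        a, b = tuple(link)
        adj[a].add(b)
        adj[b].add(a)
    worst = 0
    for src in routers:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # plain breadth-first search
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(routers):
            return None
        worst = max(worst, max(dist.values()))
    return worst


routers = list(range(8))
root = star_root_network(routers, hub=0)
print(len(root), "root links out of", 8 * 7 // 2, "total links")
print("max hop count:", max_hop_count(routers, root))
```

Because the root links are never power-gated, any other link can be turned off without a connectivity check.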
15 Observation on Maximizing Path Diversity
- Which links should be ON for high path diversity?
  - Approach 1: distribute ON links
  - Approach 2: concentrate ON links (provides better path diversity)
- [Figure: the two approaches, with links marked OFF, ON (root network), or ON (additional links)]
16-18 Hub Routers for High Path Diversity
- Hub routers create a small-world network
  - Similarly, airlines create hub airports to reduce cost
- Quantitative results: 1D flattened butterfly (32 routers, 1024 nodes)
  - No non-minimal paths with more than 2 hops
  - Random distribution: average from 10,000 samples
  - Concentrating ON links to "hub" routers improved the total number of paths by 1.9x over random distribution
- TCEP concentrates ON links to a small number of hub routers.
- [Figure: normalized number of total paths vs. fraction of active links (%), for "concentrate to hub routers" and "randomly distribute"]
- [Figure: hub airport; edges are direct flights by United Airlines]
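The effect of concentrating ON links can be checked on a small example by counting every path of at most 2 hops between router pairs in a 1D flattened butterfly. The sketch below is an illustration under assumed parameters (8 routers, a star root network at router 0, and 6 additional ON links); it does not reproduce the 32-router experiment or the 1.9x figure from the slides.

```python
from itertools import combinations


def count_paths(n, on_links):
    """Count paths of at most 2 hops over ON links, summed over router pairs."""
    on = {frozenset(link) for link in on_links}
    total = 0
    for s, d in combinations(range(n), 2):
        if frozenset((s, d)) in on:
            total += 1  # direct (minimal) path
        for m in range(n):
            if m in (s, d):
                continue
            if frozenset((s, m)) in on and frozenset((m, d)) in on:
                total += 1  # 2-hop (non-minimal) path through m
    return total


n = 8
root = [(0, r) for r in range(1, n)]          # always-ON star at router 0
concentrated = [(1, r) for r in range(2, n)]  # extra links make router 1 a hub
distributed = [(r, r + 1) for r in range(1, n - 1)]  # same number, spread out

print("concentrated:", count_paths(n, root + concentrated))
print("distributed: ", count_paths(n, root + distributed))
```

Both configurations turn on the same number of links, so any difference in the path count comes only from where the additional links are placed.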
19-20 Observation on Minimizing Impact on Network
- Differentiate the type of traffic (minimally vs. non-minimally routed)
  - Power-gate candidate 1 carries minimally routed traffic: gating it increases hop count and bandwidth usage
  - Power-gate candidate 2 carries non-minimally routed traffic: gating it keeps the same hop count and bandwidth usage
- TCEP prioritizes power-gating links with the least amount of minimally routed traffic.
- [Figure: a router making a power-gating decision between the two candidate links, with ON and OFF links marked]
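The selection criterion stated above can be sketched as a simple minimum over per-link traffic counters. The record fields (`name`, `on`, `root`, `min_traffic`) and the function `pick_link_to_gate` are hypothetical; the slides do not describe how the counters are kept, only that links with the least minimally routed traffic are gated first.

```python
def pick_link_to_gate(links):
    """Pick the ON, non-root link carrying the least minimally routed traffic.

    links: list of dicts with assumed fields
           name, on (bool), root (bool), min_traffic (flits seen this epoch).
    Returns the link name, or None when nothing can be gated.
    """
    # Root network links must stay ON, and OFF links cannot be gated again.
    candidates = [l for l in links if l["on"] and not l["root"]]
    if not candidates:
        return None
    return min(candidates, key=lambda l: l["min_traffic"])["name"]


links = [
    {"name": "R0-R1", "on": True, "root": True, "min_traffic": 0},
    {"name": "R0-R2", "on": True, "root": False, "min_traffic": 120},
    {"name": "R0-R3", "on": True, "root": False, "min_traffic": 5},
    {"name": "R0-R4", "on": False, "root": False, "min_traffic": 0},
]
print(pick_link_to_gate(links))
```

In this example R0-R3 is selected: gating it displaces the smallest amount of traffic onto longer, non-minimal routes.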
21-23 Problem with Load-balanced Routing
- No global link state information: a non-minimal path can become significantly longer
- Baseline non-minimal routing: source, then intermediate (INTM) router, then destination
  - Baseline (no power-gating): 4 hops
  - With power-gating (some links are OFF): 8 hops
- [Figure: routes from SRC through INTM to DEST around a congested link, without and with power-gating]
24 Proposed PAL Routing
- PAL: Power-Aware progressive Load-balanced routing
- Instead of global randomization, randomize locally
- Compared to the baseline load-balanced routing:
  - The same maximum hop count (2 hops in each dimension)
  - No additional virtual channels (dimension-order routing)
- [Figure: PAL route from SRC to DEST through per-dimension intermediate routers (INTM_X, INTM_Y) and DEST_X, with a congested link marked]
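The slide gives no algorithmic detail for PAL, so the following is only a sketch of what "locally randomize" could look like inside one fully connected subnetwork: take the direct link when it is ON and uncongested, otherwise pick a random intermediate router that is reachable over ON links on both hops. The function `route_in_subnetwork`, its parameters, and the fallback order are all assumptions, not the authors' method.

```python
import random


def route_in_subnetwork(src, dest, routers, on_links,
                        congested=frozenset(), rng=random):
    """Route within one subnetwork using at most 2 hops over ON links.

    on_links / congested: sets of frozenset({a, b}) link identifiers.
    """
    if src == dest:
        return [src]
    direct = frozenset((src, dest))
    if direct in on_links and direct not in congested:
        return [src, dest]  # minimal route
    # Local randomization: only intermediates whose two links are both ON.
    intermediates = [
        m for m in routers
        if m not in (src, dest)
        and frozenset((src, m)) in on_links
        and frozenset((m, dest)) in on_links
    ]
    if intermediates:
        return [src, rng.choice(intermediates), dest]
    if direct in on_links:
        return [src, dest]  # congested, but the only ON option
    raise ValueError("no route of at most two hops over ON links")


routers = [0, 1, 2, 3]
root = {frozenset((0, r)) for r in (1, 2, 3)}  # star root network at router 0
print(route_in_subnetwork(1, 2, routers, root))
```

Applying such a step once per dimension, in dimension order, would keep the route to at most 2 hops in each dimension, which matches the hop-count bound stated on the slide.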
25 Other Issues Addressed in the Paper
- Challenge: the observations can lead to different decisions
  - We propose a low-complexity algorithm to reconcile them
- Only one link is turned on/off from a router per epoch, to avoid supply voltage shift
- Routers keep track of non-minimal paths within a subnetwork
- More details in the paper
26 Methodology
- Booksim: cycle-accurate interconnection network simulator
- SST/Macro + Booksim for real workload evaluation
- Previous work compared: SLaC (Staged Laser Control) [HPCA '16]
  - A stage corresponds to a row of routers in a 2D flattened butterfly
  - Turns a stage on/off in a coarse-grained manner
- Network configuration (parameter: value):
  - Topology: 512-node 2D flattened butterfly
  - Virtual channels: 6 VCs for baseline, 7 VCs for TCEP and SLaC
  - Link wake-up delay: 1 µs (= epoch length; sensitivity study in the paper)
27 Synthetic Traffic Results
- [Figure, performance: average packet latency (cycles) vs. injection rate (flits/cycle/node) for Baseline, SLaC and TCEP; panels for uniform random and bit reverse traffic patterns]
  - Bit reverse: 7x throughput difference
- [Figure, energy: normalized energy per flit vs. injection rate (flits/cycle/node) for Baseline, SLaC and TCEP; panels for uniform random and bit reverse traffic patterns]
  - Bit reverse: up to 73% lower energy
28 Multi-workload Scenario Results
- Two different batch workloads running simultaneously
  - High and low injection rates
  - 100 random node mappings
- [Figure: energy ratio (SLaC/TCEP) vs. mapping number, uniform random traffic pattern; annotated "% lower energy by TCEP" (value not preserved)]
  - Runtime was similar (within 0.3%)
- [Figure: energy ratio (SLaC/TCEP) vs. mapping number, random permutation traffic pattern; annotated "% lower energy by TCEP" and "TCEP was x faster than SLaC" (values not preserved)]
29 Real Workload Results
- Packet latency: SLaC significantly increased latency for some workloads
- Network energy: significant energy savings by both SLaC and TCEP
- Overall, 46% lower compared to SLaC
- [Figure: normalized average packet latency for Baseline, SLaC and TCEP on HILO, BoxMG, MG, NB, FB, BigFFT and GMEAN; one bar is labeled 4.51]
- [Figure: normalized energy for Baseline, SLaC and TCEP on the same workloads]
30 Conclusion
- TCEP consolidates traffic to proactively power-gate links
- Key observations
  - Concentrate ON links to hub routers for high path diversity
  - Differentiate minimally and non-minimally routed traffic
- PAL (Power-Aware progressive Load-balanced) routing
  - Load-balanced routing without global link state information
  - Keeps the max. hop count the same as the baseline without power-gating
- Results (compared to SLaC)
  - For adversarial traffic patterns, up to 7x higher throughput
  - For multi-workload scenarios, up to ~70% lower energy and runtime
More informationInterconnection Network
Interconnection Network Recap: Generic Parallel Architecture A generic modern multiprocessor Network Mem Communication assist (CA) $ P Node: processor(s), memory system, plus communication assist Network
More informationAdaptive Routing. Claudio Brunelli Adaptive Routing Institute of Digital and Computer Systems / TKT-9636
1 Adaptive Routing Adaptive Routing Basics Minimal Adaptive Routing Fully Adaptive Routing Load-Balanced Adaptive Routing Search-Based Routing Case Study: Adapted Routing in the Thinking Machines CM-5
More informationAsynchronous Bypass Channel Routers
1 Asynchronous Bypass Channel Routers Tushar N. K. Jain, Paul V. Gratz, Alex Sprintson, Gwan Choi Department of Electrical and Computer Engineering, Texas A&M University {tnj07,pgratz,spalex,gchoi}@tamu.edu
More informationSURVEY ON LOW-LATENCY AND LOW-POWER SCHEMES FOR ON-CHIP NETWORKS
SURVEY ON LOW-LATENCY AND LOW-POWER SCHEMES FOR ON-CHIP NETWORKS Chandrika D.N 1, Nirmala. L 2 1 M.Tech Scholar, 2 Sr. Asst. Prof, Department of electronics and communication engineering, REVA Institute
More informationHousekeeping. Fall /5 CptS/EE 555 1
Housekeeping Lab access HW turn-in Jin? Class preparation for next time: look at the section on CRCs 2.4.3. Be prepared to explain how/why the shift register implements the CRC Skip Token Rings section
More informationNetwork Control and Signalling
Network Control and Signalling 1. Introduction 2. Fundamentals and design principles 3. Network architecture and topology 4. Network control and signalling 5. Network components 5.1 links 5.2 switches
More informationA New Energy-Aware Routing Protocol for. Improving Path Stability in Ad-hoc Networks
Contemporary Engineering Sciences, Vol. 8, 2015, no. 19, 859-864 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ces.2015.57207 A New Energy-Aware Routing Protocol for Improving Path Stability
More informationEE/CSCI 451: Parallel and Distributed Computation
EE/CSCI 451: Parallel and Distributed Computation Lecture #11 2/21/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Outline Midterm 1:
More informationChapter 7 Slicing and Dicing
1/ 22 Chapter 7 Slicing and Dicing Lasse Harju Tampere University of Technology lasse.harju@tut.fi 2/ 22 Concentrators and Distributors Concentrators Used for combining traffic from several network nodes
More informationHardware Evolution in Data Centers
Hardware Evolution in Data Centers 2004 2008 2011 2000 2013 2014 Trend towards customization Increase work done per dollar (CapEx + OpEx) Paolo Costa Rethinking the Network Stack for Rack-scale Computers
More informationThe Design and Implementation of a Low-Latency On-Chip Network
The Design and Implementation of a Low-Latency On-Chip Network Robert Mullins 11 th Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 24-27 th, 2006, Yokohama, Japan. Introduction Current
More informationLecture 12: Interconnection Networks. Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E)
Lecture 12: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) 1 Topologies Internet topologies are not very regular they grew
More informationENERGY-EFFICIENT TRAFFIC MERGING FOR DATACENTER NETWORKS
ENERGY-EFFICIENT TRAFFIC MERGING FOR DATACENTER NETWORKS 1 RUSHIKESH MADHUKAR BAGE, 2 GAURAV DIGAMBAR DOIPHODE 1, 2 Graduate Student, Department Of Computer Science, Mumbai, Maharashtra, India 1 Email
More informationEC 513 Computer Architecture
EC 513 Computer Architecture On-chip Networking Prof. Michel A. Kinsy Virtual Channel Router VC 0 Routing Computation Virtual Channel Allocator Switch Allocator Input Ports VC x VC 0 VC x It s a system
More informationThe Impact of Optics on HPC System Interconnects
The Impact of Optics on HPC System Interconnects Mike Parker and Steve Scott map@cray.com, sscott@cray.com Cray Inc. Index Terms interconnection network, high-radix router, network topology, optical interconnect
More informationHigh Performance Datacenter Networks
M & C Morgan & Claypool Publishers High Performance Datacenter Networks Architectures, Algorithms, and Opportunity Dennis Abts John Kim SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE Mark D. Hill, Series
More informationSIGNET: NETWORK-ON-CHIP FILTERING FOR COARSE VECTOR DIRECTORIES. Natalie Enright Jerger University of Toronto
SIGNET: NETWORK-ON-CHIP FILTERING FOR COARSE VECTOR DIRECTORIES University of Toronto Interaction of Coherence and Network 2 Cache coherence protocol drives network-on-chip traffic Scalable coherence protocols
More informationCSE 461 Routing. Routing. Focus: Distance-vector and link-state Shortest path routing Key properties of schemes
CSE 46 Routing Routing Focus: How to find and set up paths through a network Distance-vector and link-state Shortest path routing Key properties of schemes Application Transport Network Link Physical Forwarding
More information