Routing Algorithm. How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus)


Routing Algorithm
  How do I know where a packet should go?
  Topology does NOT determine routing (e.g., many paths through torus)
  Many routing algorithms exist:
    1) Arithmetic
    2) Source-based
    3) Table lookup
    4) Adaptive: route based on network state (e.g., contention)

(1) Arithmetic Routing
  For regular topology, use simple arithmetic to determine route
  E.g., 3D torus
    Packet header contains signed offset to destination (per dimension)
    At each hop, switch increments or decrements (+/-) to reduce the offset in one dimension
    When the offset is zero in every dimension (x == 0, y == 0, z == 0), the packet is at the correct processor
  Drawbacks
    Requires ALU in switch
    Must re-compute CRC at each hop (the header changes)
  [Figure: 2x2x2 cube with nodes labeled (0,0,0) through (1,1,1)]
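The offset arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not any machine's actual switch logic: the function names, the choice to reduce the lowest-numbered nonzero dimension first, and the assumption that every dimension has the same radix k are all made up for the example.

```python
def signed_offset(src, dst, k):
    """Shortest signed distance from src to dst on a ring of k nodes."""
    d = (dst - src) % k
    return d - k if d > k // 2 else d


def make_header(src, dst, k):
    """Per-dimension signed offsets, as carried in the packet header."""
    return [signed_offset(s, d, k) for s, d in zip(src, dst)]


def route_step(header):
    """One switch hop: move the first nonzero offset one step toward zero.

    Returns (dimension, direction), or None once every offset is zero.
    """
    for dim, off in enumerate(header):
        if off != 0:
            step = 1 if off > 0 else -1
            header[dim] -= step
            return dim, step
    return None


def route(src, dst, k):
    """List of nodes visited from src to dst on a torus of radix k (assumed)."""
    header = make_header(src, dst, k)
    pos = list(src)
    path = [tuple(pos)]
    hop = route_step(header)
    while hop is not None:
        dim, step = hop
        pos[dim] = (pos[dim] + step) % k  # wrap around the torus
        path.append(tuple(pos))
        hop = route_step(header)
    return path
```

For example, `route((0, 0, 0), (3, 0, 0), 4)` takes the single wraparound hop in the negative x direction instead of three hops in the positive direction.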

(2) Source Based & (3) Table Lookup Routing
  Source Based
    Source specifies output port for each switch in route
    Very simple switches
      No control state
      Strip output port off header
    Myrinet used this
    Can't be made adaptive
  Table Lookup
    Very small header, index into table for output port
    Big tables, must be kept up-to-date
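To contrast where the routing state lives in the two schemes, here is a small sketch. The header layout (a plain list of port numbers) and the table format (a dict from destination index to port) are assumptions for illustration, not the Myrinet format.

```python
def source_route_hop(header):
    """Source-based: the switch strips the first port off the header and uses it.

    Returns (output_port, remaining_header). The switch keeps no state.
    """
    return header[0], header[1:]


def follow_source_route(header):
    """Output ports chosen by successive switches along the route."""
    ports = []
    while header:
        port, header = source_route_hop(header)
        ports.append(port)
    return ports


def table_hop(table, dest_index):
    """Table lookup: the header carries only an index; the switch owns the table."""
    return table[dest_index]
```

In the source-based case the header shrinks by one entry per hop; in the table case the header stays small, but every switch needs an entry for every destination it may see.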

Deterministic vs. Adaptive Routing
  Deterministic (static): follows a pre-specified route
    K-ary d-cube: dimension-order routing
      » (x1, y1) -> (x2, y2)
      » First Dx = x2 - x1,
      » Then Dy = y2 - y1
    Tree: common ancestor
  Adaptive: route determined by contention for output port
  [Figure: eight nodes labeled 000 through 111]
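Dimension-order routing for the 2D case above can be sketched as follows. The sketch assumes a mesh without wraparound links, and the function name is illustrative.

```python
def dimension_order_route(src, dst):
    """Nodes visited from src to dst: resolve all of Dx first, then all of Dy."""
    x, y = src
    x2, y2 = dst
    path = [(x, y)]
    while x != x2:  # first dimension: x
        x += 1 if x2 > x else -1
        path.append((x, y))
    while y != y2:  # second dimension: y
        y += 1 if y2 > y else -1
        path.append((x, y))
    return path
```

Because the route depends only on source and destination, two packets between the same pair of nodes always take the same path, whatever the contention.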

(4) Adaptive Routing
  Essential for fault tolerance
    At least multipath
  Can improve utilization of the network
  Simple deterministic algorithms easily run into bad permutations
  Fully/partially adaptive, minimal/non-minimal
  Can introduce complexity or anomalies
  A little adaptation goes a long way!

Hot Potato Routing
  Every cycle, each switch takes each input and routes it to an output
    But not necessarily to the desired output
  No switch buffering!
  Possibility of livelock if no precautions taken
    E.g., could grant priority based on age of packet
  Variants also known as deflection routing or mad postman routing
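One switch cycle with age-based priority might look like the sketch below. The packet tuple layout, the two-pass structure, and the "lowest free port" deflection choice are assumptions made for the example; real deflection routers differ in how they pick the deflection port.

```python
def hot_potato_cycle(packets, num_ports):
    """Assign every arriving packet to some output port in a single cycle.

    packets: list of (packet_id, age, desired_port).
    Returns {packet_id: output_port}. Nothing is buffered: a packet that
    loses arbitration is deflected to a free port instead of waiting.
    """
    if len(packets) > num_ports:
        raise ValueError("a switch cannot receive more packets than it has ports")
    free = set(range(num_ports))
    assignment = {}
    deflected = []
    # Pass 1: oldest packets get first pick of their desired port.
    for pid, age, want in sorted(packets, key=lambda p: -p[1]):
        if want in free:
            assignment[pid] = want
            free.remove(want)
        else:
            deflected.append(pid)
    # Pass 2: losers are deflected to whatever ports remain.
    for pid in deflected:
        port = min(free)
        assignment[pid] = port
        free.remove(port)
    return assignment
```

Giving priority to the oldest packet bounds how often any one packet can be deflected, which is the livelock precaution the slide mentions.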

Deadlock
  Necessary conditions for deadlock
    Use more than one resource
    Not willing to release resource in use
    Cycle in order of resource use
  (Guri Sohi)

Two Causes of Deadlock
  Endpoint Deadlock
    [Figure: P1 and P2, each with a queue labeled "full of requests" and a Response pending toward the other]
  Switch Deadlock
    [Figure: switch1 and switch2, each with a buffer labeled "full of messages"; Message M1 and Message M2 pending between them]

Avoiding Deadlock
  Simple but wasteful solution: full buffering
    But it's rare that we ever need full buffering
  More efficient solution: virtual channels (networks)
  Endpoint deadlock solution: virtual networks
    Need a virtual network per type of message
  Switch deadlock solution #1: virtual channels
  Switch deadlock solution #2: deadlock-free routing

Virtual Channels
  Need some number of virtual channels per virtual network, which depends on network topology and routing scheme
  Not to be confused with virtual cut-through
  Add buffers so flits of wormhole packets can be interleaved
  Optional paper by Dally on virtual channels (see course website)
  Upshot: total #virtual channels equals #virtual networks times #virtual channels (per virtual network) needed to avoid switch deadlock
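As a worked instance of the upshot, suppose (illustrative numbers only) a protocol with two message types, request and response, and a routing scheme that needs two virtual channels per virtual network to avoid switch deadlock:

```latex
\[
\#\mathrm{VC}_{\text{total}}
  = \#\mathrm{VN} \times \#\mathrm{VC}_{\text{per VN}}
  = 2 \times 2
  = 4
\]
```

Each physical link then needs buffering for four virtual channels.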

Deadlock Free Routing
  Up*-Down*
    For spanning trees (superposed on any topology)
    Route up, make one turn, route down
  Turn Model Routing
    Restrict order of turns
      » West first
      » North last
      » Negative first
  Can increase number of hops
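Both restrictions can be expressed as simple checks on a sequence of hops. The sketch below assumes hops are given as direction labels ('N', 'S', 'E', 'W') for a 2D mesh and as 'up'/'down' link labels for the spanning-tree case; these encodings and function names are illustrative.

```python
def west_first_legal(path):
    """West-first turn model: any westward hops must come first.

    A packet may start by heading west or keep heading west, but it may
    never turn into the west direction after travelling any other way.
    """
    for prev, nxt in zip(path, path[1:]):
        if nxt == "W" and prev != "W":
            return False
    return True


def up_down_legal(path):
    """Up*/down*: zero or more 'up' links followed by zero or more 'down' links."""
    seen_down = False
    for link in path:
        if link == "down":
            seen_down = True
        elif seen_down:  # an 'up' link after a 'down' link
            return False
    return True
```

A route that is minimal but illegal under these checks must be replaced by a legal, possibly longer one, which is why the restrictions can increase the hop count.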

Minimal turn restrictions in 2D
  [Figure: turns among the +x, -x, +y, -y directions under west-first, north-last, and negative-first routing]

Outline
  3 main aspects of networks: Topology, Routing, Flow Control
  Designing Switch Hardware
  Case Studies

Buffer-less Flow Control
  Circuit Switching
    Establish route then send data
    Like the telephone system
    No buffers needed
    This approach differs from packet switching, which is what we've implicitly assumed until this slide
  Hot Potato Routing
    No buffers needed, since all incoming packets get sent out on some link without waiting
  Packet Discarding
    If two packets contend for same resource, one gets dropped
    Relies on higher-order mechanism (retry)

Buffered Flow Control
  Packet switched networks do not reserve bandwidth, which can lead to contention
  Solution: prevent packets from entering until contention is reduced (e.g., metering lights)
  Flow control: between pairs of receivers and senders; use feedback to tell the sender when it is allowed to send the next packet
    Link-level: flow control done on per-link basis
    End-to-end: flow control done over entire path length

Link-Level Flow Control
  [Figure: sender and receiver connected by Ready and Data signals]
  Transfer single flit when receiver is ready
  Could have long links with many flits in flight

Credit-based (Window) Flow Control
  Receiver gives N credits to sender
  Sender decrements count on each send
  Stops sending if zero
  Receiver sends back credit as it drains its buffer
  Bundle credits to reduce overhead
  Must account for link latency
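The credit loop can be sketched with two small classes. This is a simplified model that ignores link latency and credit bundling; the class and method names are illustrative.

```python
from collections import deque


class CreditSender:
    """Tracks how many receiver buffer slots the sender may still use."""

    def __init__(self, credits):
        self.credits = credits  # N credits granted by the receiver

    def can_send(self):
        return self.credits > 0

    def send(self, flit, receiver):
        if not self.can_send():
            raise RuntimeError("no credits: sender must stall")
        self.credits -= 1
        receiver.accept(flit)

    def receive_credit(self, n=1):
        self.credits += n


class CreditReceiver:
    """Buffers flits and returns one credit for each slot it frees."""

    def __init__(self, slots):
        self.slots = slots
        self.buffer = deque()

    def accept(self, flit):
        if len(self.buffer) >= self.slots:
            raise RuntimeError("buffer overflow: credit accounting is broken")
        self.buffer.append(flit)

    def drain(self, sender):
        flit = self.buffer.popleft()
        sender.receive_credit()  # slot freed, so hand one credit back
        return flit
```

With N equal to the receiver's buffer size, the overflow check can never fire. On a real link, N also has to cover the flits and credits in flight during one round trip, or the sender stalls needlessly.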

Water Level
  High water, low water
  Stop & go sent back to source switch (Myrinet)
  Can send redundant stop/go
  [Figure: buffer with incoming phits, Stop and Go marks, and outgoing phits]
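The high-water/low-water scheme is a hysteresis on buffer occupancy. A minimal sketch, with thresholds counted in phits and with class and method names chosen for the example:

```python
class WatermarkBuffer:
    """Emits 'stop' at the high-water mark and 'go' at the low-water mark."""

    def __init__(self, high, low):
        if not low < high:
            raise ValueError("low-water mark must be below high-water mark")
        self.high = high
        self.low = low
        self.occupancy = 0
        self.stopped = False

    def _signal(self):
        """Return 'stop', 'go', or None if the state did not change."""
        if not self.stopped and self.occupancy >= self.high:
            self.stopped = True
            return "stop"
        if self.stopped and self.occupancy <= self.low:
            self.stopped = False
            return "go"
        return None

    def push(self):
        """A phit arrives from the upstream switch."""
        self.occupancy += 1
        return self._signal()

    def pop(self):
        """A phit leaves toward the downstream link."""
        self.occupancy -= 1
        return self._signal()
```

The gap between the two marks keeps the buffer from toggling stop/go on every phit. The space above the high-water mark has to absorb the phits still in flight when the stop is sent.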

Outline
  3 main aspects of networks: Topology, Routing, Flow Control
  Designing Switch Hardware
  Case Studies

A Generic Switch
  At minimum, must route inputs to outputs
  [Figure: Input Ports (Receiver, Input Buffer) -> Crossbar -> Output Ports (Output Buffer, Transmitter); Control: Routing, Scheduling]

Switch Operation
  Each packet (flit) traverses the switch's pipeline
    Arrive in input buffer and wait to get to head of queue
    Compute route (once per packet)
    Allocate virtual channel (once per packet)
    Allocate crossbar and output buffer entry
    Traverse crossbar
    Wait in output buffer to be allocated output link
  Switch is like very simple in-order processor pipeline
    Packet can stall at any stage
    Only head flit, though, can stall when computing route or allocating virtual channel

Switch Buffering
  Must absorb burstiness in arriving traffic
    Unless using hot potato routing
  Must also hold flits that are stalled
  Option #1: Shared buffer pool
    Need high bandwidth to buffer (bottleneck)
    One congested output port could hog all buffer space
  Option #2: Input buffering
    #2a) Separate buffer per input port
    #2b) Separate buffer per virtual channel per input port

More Input Buffering
  If buffer per input port, then could suffer from head of line (HOL) blocking
    A subsequent packet may be destined for an unused output port, but it is stuck behind the blocked packet at the head of the queue
  Either way (#2a or #2b), still likely to need output buffering, but this doesn't need to be divided up by virtual channel
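Head-of-line blocking is easy to reproduce in a toy model. In the sketch below each queued packet is represented only by the output port it wants, and the set of busy outputs is passed in; both choices, and the function names, are simplifications for illustration.

```python
from collections import deque


def next_from_single_fifo(fifo, busy_outputs):
    """Option #2a: one FIFO per input port. Only the head may leave."""
    if fifo and fifo[0] not in busy_outputs:
        return fifo.popleft()
    return None  # head is blocked, so everything behind it waits too


def next_from_vc_queues(vc_queues, busy_outputs):
    """Option #2b: one queue per virtual channel. Any unblocked head may leave."""
    for queue in vc_queues:
        if queue and queue[0] not in busy_outputs:
            return queue.popleft()
    return None
```

With output 0 busy, a single FIFO holding packets for outputs 0 and 1 sends nothing, while the same two packets split across two virtual-channel queues let the packet for output 1 proceed.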

Resource Allocation
  Policies for arbitration for crossbar, output link, etc.
    Static priority
    Random
    Round-robin
    Oldest-first
  Effects of adaptive routing?
    Select output link based on availability
    Requires feedback from output port
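Of the policies listed, round-robin is the simplest to show in code. The sketch below is a software model with an illustrative class name; a hardware arbiter would implement the same rotating priority with combinational logic.

```python
class RoundRobinArbiter:
    """Grants one requester per call, rotating priority after each grant."""

    def __init__(self, num_requesters):
        self.n = num_requesters
        self.next = 0  # requester with highest priority on the next call

    def grant(self, requests):
        """requests: list of booleans, one per requester. Returns index or None."""
        for i in range(self.n):
            candidate = (self.next + i) % self.n
            if requests[candidate]:
                # The winner becomes lowest priority for the next round.
                self.next = (candidate + 1) % self.n
                return candidate
        return None
```

Unlike static priority, no requester can be starved: a persistent requester is granted within at most n calls.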

For Future Reading
  We have covered only the tip of the iceberg, and we have hidden most of the complexity
  Issues we've brushed under the rug:
    Physical design of buffers, arbitration logic, etc.
    Non-crossbar implementations (e.g., using a bus)
    Control logic for managing switch
    Using speculation to reduce switch pipeline depth
    How flow control fits into the switch design
    Etc.
  I refer you to the textbook by Dally and Towles for a more comprehensive treatment of this subject

Outline
  3 main aspects of networks: Topology, Routing, Flow Control
  Designing Switch Hardware
  Case Studies

Case Study: Cray T3D
  1024 switch nodes, each connected to 2 processors
  3D torus, bidirectional, 300 MB/s
  Link: 16 bits, 8 control bits
  Variable size packet (multiple of 16 bits)
  Logical request & response networks
    2 virtual channels each for deadlock
  Stacked dimension routing
  Wormhole for large packets, virtual cut-through for small packets

Real (But Old) Machines

Presentation: Alpha 21364 (EV7) Network

Presentation: Flattened Butterfly