Routing Algorithm
• How do I know where a packet should go?
  » Topology does NOT determine routing (e.g., many paths through a torus)
• Many routing algorithms exist
  1) Arithmetic
  2) Source-based
  3) Table lookup
  4) Adaptive: route based on network state (e.g., contention)
(1) Arithmetic Routing
• For a regular topology, use simple arithmetic to determine the route
• E.g., 3D torus
  » Packet header contains a signed offset to the destination (per dimension)
  » At each hop, the switch adds or subtracts 1 to reduce the offset in one dimension
  » When the offset is 0 in every dimension (x == 0, y == 0, z == 0), the packet is at the correct processor
• Drawbacks
  » Requires an ALU in the switch
  » Must recompute the CRC at each hop, because the header changes
[Figure: 2x2x2 cube of nodes labeled (0,0,0) through (1,1,1)]
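The per-hop arithmetic can be sketched as follows. This is a minimal illustration, not any particular machine's implementation: the function names, the list-based header, and the choice to resolve the lowest-numbered nonzero dimension first are all assumptions made for the example.

```python
def signed_offset(src, dst, k):
    """Shortest signed distance from src to dst on a ring of k nodes."""
    d = (dst - src) % k
    return d - k if d > k // 2 else d


def make_header(src, dst, k):
    """Header carries one signed offset per dimension (hypothetical layout)."""
    return [signed_offset(s, d, k) for s, d in zip(src, dst)]


def route_step(header):
    """One switch hop: pick the first nonzero offset and move it toward zero.

    Returns (dimension, direction, new_header), or None once every offset
    is zero, meaning the packet has arrived.
    """
    for dim, off in enumerate(header):
        if off != 0:
            step = 1 if off > 0 else -1
            new_header = list(header)
            new_header[dim] = off - step
            return dim, step, new_header
    return None


def route(src, dst, k):
    """Follow the header hop by hop and return the list of nodes visited."""
    pos = list(src)
    header = make_header(src, dst, k)
    hops = [tuple(pos)]
    result = route_step(header)
    while result is not None:
        dim, step, header = result
        pos[dim] = (pos[dim] + step) % k  # wrap around the torus
        hops.append(tuple(pos))
        result = route_step(header)
    return hops


path = route((0, 0, 0), (3, 1, 2), k=4)
```

Because the header is rewritten at every hop, any checksum covering it has to be recomputed, which is the CRC drawback listed above.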
(2) Source Based & (3) Table Lookup Routing
• Source based
  » Source specifies the output port for each switch in the route
  » Very simple switches: no control state, just strip the output port off the header
  » Myrinet used this
  » Can't be made adaptive
• Table lookup
  » Very small header: index into a table for the output port
  » Big tables, which must be kept up to date
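The two schemes differ mainly in where the route lives. The sketch below contrasts them under simplifying assumptions: the header is modeled as a plain list of port numbers, the routing table as a dictionary keyed by destination ID, and the function names are hypothetical.

```python
def source_routed_trace(ports):
    """Source-based: each switch strips the first port off the header.

    Returns a list of (port_taken, remaining_header) pairs, one per switch.
    """
    header = list(ports)
    trace = []
    while header:
        port, header = header[0], header[1:]
        trace.append((port, tuple(header)))
    return trace


def table_lookup_port(routing_table, dest_id):
    """Table lookup: the header holds only dest_id; the switch holds the table."""
    return routing_table[dest_id]


# Source-based: the sender chose ports 2, 0, 1 for the three switches on the path.
trace = source_routed_trace([2, 0, 1])

# Table lookup: every switch needs its own entry for destination 7.
tables = {"S0": {7: 1}, "S1": {7: 3}}
port_at_s0 = table_lookup_port(tables["S0"], 7)
```

The source-routed header shrinks at every hop and the switch keeps no state, whereas the table-lookup header stays small and fixed but every switch must store, and keep current, an entry per destination.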
Deterministic vs. Adaptive Routing
• Deterministic (static): follows a pre-specified route
  » K-ary d-cube: dimension-order routing
    (x1, y1) → (x2, y2)
    First Δx = x2 - x1,
    then Δy = y2 - y1
  » Tree: common ancestor
• Adaptive: route determined by contention for output port
[Figure: eight nodes labeled 000 through 111]
(4) Adaptive Routing
• Essential for fault tolerance
  » At least multipath
• Can improve utilization of the network
• Simple deterministic algorithms easily run into bad permutations
• Fully/partially adaptive, minimal/non-minimal
• Can introduce complexity or anomalies
• A little adaptation goes a long way!
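To make the deterministic/adaptive distinction concrete, here is a small sketch for a 2D mesh. It assumes a hypothetical `queue_len` dictionary as the switch's view of output-port contention; a real switch would use whatever congestion signal its hardware provides. Only minimal routing is shown, so every candidate port moves the packet closer to its destination.

```python
def productive_ports(cur, dst):
    """Output ports that reduce the distance to dst (x ports listed first)."""
    (x, y), (dx, dy) = cur, dst
    ports = []
    if dx > x:
        ports.append("+x")
    elif dx < x:
        ports.append("-x")
    if dy > y:
        ports.append("+y")
    elif dy < y:
        ports.append("-y")
    return ports


def dimension_order_port(cur, dst):
    """Deterministic: always finish x before touching y."""
    ports = productive_ports(cur, dst)
    return ports[0] if ports else None


def minimal_adaptive_port(cur, dst, queue_len):
    """Adaptive: among productive ports, pick the least-loaded one."""
    ports = productive_ports(cur, dst)
    if not ports:
        return None
    return min(ports, key=lambda p: queue_len.get(p, 0))


cur, dst = (0, 0), (2, 3)
static_choice = dimension_order_port(cur, dst)
adaptive_choice = minimal_adaptive_port(cur, dst, {"+x": 5, "+y": 1})
```

With the `+x` port congested, the deterministic router still picks `+x`, while the adaptive one steers to `+y`. Even this single choice between two ports is a form of the "little adaptation" the slide mentions.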
Hot Potato Routing
• Every cycle, each switch takes each input and routes it to an output
  » But not necessarily to the desired output
  » No switch buffering!
• Possibility of livelock if no precautions are taken
  » E.g., could grant priority based on age of packet
• Variants also known as deflection routing or mad postman routing
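One cycle of deflection routing with age-based priority might look like the sketch below. It assumes each packet is a hypothetical `(age, desired_port)` pair and that the switch has at least as many outputs as arriving packets, so every packet leaves in the same cycle.

```python
def deflection_route(packets, num_ports):
    """Assign every arriving packet to some output port in a single cycle.

    packets: list of (age, desired_port). Older packets choose first;
    losers are deflected to whatever ports remain free.
    Returns {packet_index: assigned_port}.
    """
    assert len(packets) <= num_ports, "bufferless switch needs a port per packet"
    free = set(range(num_ports))
    assignment = {}
    deflected = []
    # Oldest first; ties keep arrival order because sorted() is stable.
    order = sorted(range(len(packets)), key=lambda i: -packets[i][0])
    for i in order:
        want = packets[i][1]
        if want in free:
            free.remove(want)
            assignment[i] = want
        else:
            deflected.append(i)
    for i in deflected:
        port = min(free)  # any free port would do; min() keeps it deterministic
        free.remove(port)
        assignment[i] = port
    return assignment


# Packets 0 and 1 both want port 0; packet 1 is older, so packet 0 is deflected.
result = deflection_route([(1, 0), (5, 0), (3, 2)], num_ports=3)
```

Giving priority to older packets bounds how long a packet can keep losing arbitration, which is one way to address the livelock concern noted above.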
Deadlock
• Necessary conditions for deadlock
  » Use more than one resource
  » Not willing to release a resource in use
  » Cycle in the order of resource use
Guri Sohi
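The third condition, a cycle in the order of resource use, can be checked mechanically. The sketch below assumes the dependencies are given as a dictionary mapping each resource (for example, a channel) to the resources that may be requested while holding it; the function name and representation are illustrative only.

```python
def has_cycle(deps):
    """Depth-first search for a cycle in a resource dependency graph.

    deps: {resource: iterable of resources requested while holding it}.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in deps.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:  # back edge: we returned to the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(deps))


# Four channels around a ring, each waiting on the next: deadlock is possible.
ring = {"c0": ["c1"], "c1": ["c2"], "c2": ["c3"], "c3": ["c0"]}
# Removing the wrap-around dependency breaks the cycle.
chain = {"c0": ["c1"], "c1": ["c2"], "c2": ["c3"]}
```

Here `has_cycle(ring)` is true and `has_cycle(chain)` is false. A cycle in this graph is only one of the necessary conditions, so finding one means deadlock is possible, not that it will occur.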
Two Causes of Deadlock
• Endpoint deadlock
  [Figure: P1 and P2, each with a queue full of requests, each trying to send a Response to the other]
• Switch deadlock
  [Figure: switch1 and switch2, each with a buffer full of messages, exchanging Message M1 and Message M2]
Avoiding Deadlock
• Simple but wasteful solution: full buffering
  » But it's rare that we ever need full buffering
• More efficient solution: virtual channels (networks)
• Endpoint deadlock solution: virtual networks
  » Need a virtual network per type of message
• Switch deadlock solution #1: virtual channels
• Switch deadlock solution #2: deadlock-free routing
Virtual Channels
• Need some number of virtual channels per virtual network; the number depends on network topology and routing scheme
• Not to be confused with virtual cut-through
• Add buffers so flits of wormhole packets can be interleaved
• Optional paper by Dally on virtual channels (see course website)
• Upshot: total #virtual channels = #virtual networks × #virtual channels needed to avoid switch deadlock
Deadlock Free Routing
• Up*-Down*
  » For spanning trees (superposed on any topology)
  » Route up, make one turn, route down
• Turn Model Routing
  » Restrict order of turns
    West first
    North last
    Negative first
  » Can increase number of hops
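As an illustration of a turn restriction, here is a minimal sketch of west-first routing on a 2D mesh. It assumes +x is east and +y is north, shows only the minimal variant, and uses hypothetical names throughout. The rule is that all westward hops happen first; after that the packet may choose freely among east, north, and south, so it never turns into the west direction.

```python
STEP = {"W": (-1, 0), "E": (1, 0), "N": (0, 1), "S": (0, -1)}


def west_first_ports(cur, dst):
    """Ports permitted by (minimal) west-first routing at node cur."""
    (x, y), (dx, dy) = cur, dst
    if dx < x:
        return ["W"]  # any westward travel must be completed first
    ports = []
    if dx > x:
        ports.append("E")
    if dy > y:
        ports.append("N")
    if dy < y:
        ports.append("S")
    return ports


def route(cur, dst, choose=lambda ports: ports[-1]):
    """Walk from cur to dst; `choose` stands in for an adaptive selection."""
    hops = []
    while cur != dst:
        port = choose(west_first_ports(cur, dst))
        hops.append(port)
        cur = (cur[0] + STEP[port][0], cur[1] + STEP[port][1])
    return hops


westbound = route((3, 0), (1, 2))  # must go west before anything else
eastbound = route((0, 0), (2, 1))  # free to interleave E and N
```

The westbound packet has no routing freedom until its x offset is resolved, while the eastbound packet can adapt at every hop. That asymmetry is the price paid for removing the turns that could close a cycle.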
Minimal turn restrictions in 2D
[Figure: turn diagrams for west-first, north-last, and negative-first routing; axes +x, -x, +y, -y]
Outline
• Topology
• Routing
• Flow Control
  (the three items above: 3 main aspects of networks)
• Designing Switch Hardware
• Case Studies
Buffer-less Flow Control
• Circuit switching
  » Establish route, then send data (like the telephone system)
  » No buffers needed
  » This approach differs from packet switching, which is what we've implicitly assumed until this slide
• Hot potato routing
  » No buffers needed, since all incoming packets get sent out on some link without waiting
• Packet discarding
  » If two packets contend for the same resource, one gets dropped
  » Relies on a higher-level mechanism (retry)
Buffered Flow Control
• Packet-switched networks do not reserve bandwidth, which can lead to contention
• Solution: prevent packets from entering until contention is reduced (e.g., metering lights)
• Flow control: between pairs of receivers and senders; use feedback to tell the sender when it is allowed to send the next packet
  » Link-level: flow control done on a per-link basis
  » End-to-end: flow control done over the entire path length
Link-Level Flow Control
[Figure: sender and receiver connected by Ready and Data lines]
• Transfer a single flit when the receiver is ready
• Could have long links with many flits in flight
Credit-based (Window) Flow Control
• Receiver gives N credits to sender
• Sender decrements its count on each send
• Sender stops sending when the count reaches zero
• Receiver sends back a credit as it drains its buffer
• Bundle credits to reduce overhead
• Must account for link latency
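The credit mechanism can be sketched as a small class. This is a deliberately simplified model: the class and method names are hypothetical, and credits return instantly, which ignores the link latency the slide warns about.

```python
from collections import deque


class CreditLink:
    """Sender-side credit counter plus the receiver buffer it protects."""

    def __init__(self, credits):
        self.capacity = credits   # receiver buffer size, N
        self.credits = credits    # sender's current credit count
        self.buffer = deque()     # flits waiting at the receiver

    def try_send(self, flit):
        """Send only if a credit is available; each send consumes one."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.buffer.append(flit)
        return True

    def drain(self):
        """Receiver consumes a flit and returns one credit to the sender.

        Assumption: the credit arrives immediately (zero link latency).
        """
        flit = self.buffer.popleft()
        self.credits += 1
        return flit


link = CreditLink(credits=2)
sent = [link.try_send(f) for f in ("a", "b", "c")]  # third send is refused
first = link.drain()                                 # frees one buffer slot
retry = link.try_send("c")                           # now succeeds
```

Because a send is only allowed while a credit is held, the receiver buffer can never overflow. On a real link each credit takes time to travel back, so N would need to cover roughly a round trip's worth of flits to keep the link busy.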
Water Level
• High water, low water
• Stop & go sent back to source switch (Myrinet)
• Can send redundant stop/go
[Figure: buffer with incoming phits, Stop and Go marks, and outgoing phits]
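The high-water/low-water scheme is a form of hysteresis, sketched below. The class name, the string signals, and the thresholds are assumptions for illustration; this is not a model of Myrinet's actual hardware.

```python
class WatermarkBuffer:
    """Receiver buffer that emits STOP at high water and GO at low water."""

    def __init__(self, high, low):
        assert low < high
        self.high, self.low = high, low
        self.level = 0          # phits currently buffered
        self.stopped = False    # last signal sent to the source

    def enqueue(self):
        """A phit arrives; returns "STOP" when the high-water mark is reached."""
        self.level += 1
        if not self.stopped and self.level >= self.high:
            self.stopped = True
            return "STOP"
        return None

    def dequeue(self):
        """A phit leaves; returns "GO" once the level falls to low water."""
        self.level -= 1
        if self.stopped and self.level <= self.low:
            self.stopped = False
            return "GO"
        return None


buf = WatermarkBuffer(high=3, low=1)
arrivals = [buf.enqueue() for _ in range(4)]    # 4th phit was already in flight
departures = [buf.dequeue() for _ in range(3)]
```

The fourth arrival shows why the buffer needs headroom above the high-water mark: phits already on the link keep arriving after STOP has been sent. The gap between the two marks keeps the link from toggling between stop and go on every phit.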
A Generic Switch
• At minimum, must route inputs to outputs
[Figure: input ports (receiver, input buffer) feed a crossbar, which feeds output ports (output buffer, transmitter); a control block handles routing and scheduling]
Switch Operation
• Each packet (flit) traverses the switch's pipeline
  » Arrive in input buffer and wait to get to head of queue
  » Compute route (once per packet)
  » Allocate virtual channel (once per packet)
  » Allocate crossbar and output buffer entry
  » Traverse crossbar
  » Wait in output buffer to be allocated output link
• Switch is like a very simple in-order processor pipeline
  » Packet can stall at any stage
  » Only the head flit, though, can stall when computing route or allocating virtual channel
Switch Buffering
• Must absorb burstiness in arriving traffic
  » Unless using hot potato routing
• Must also hold flits that are stalled
• Option #1: Shared buffer pool
  » Need high bandwidth to buffer (bottleneck)
  » One congested output port could hog all buffer space
• Option #2: Input buffering
  » #2a) Separate buffer per input port
  » #2b) Separate buffer per virtual channel per input port
More Input Buffering
• If buffer per input port, then could suffer from head-of-line (HOL) blocking
  » A subsequent packet may be routed to an unused output port, yet it must wait behind the blocked head packet
• Either way (#2a or #2b), still likely to need output buffering, but this doesn't need to be divided up by virtual channel
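Head-of-line blocking is easy to see in a toy model. The sketch below compares a single FIFO per input port against split queues at the same port. As a simplification, the split queues are keyed by destination output port, standing in for the per-virtual-channel buffers of option #2b; the function names and the one-packet-per-cycle limit are assumptions.

```python
from collections import deque


def delivered_single_fifo(dests, blocked_output, cycles):
    """One FIFO per input: only the head packet may leave each cycle."""
    queue = deque(dests)
    sent = 0
    for _ in range(cycles):
        if queue and queue[0] != blocked_output:
            queue.popleft()
            sent += 1
    return sent


def delivered_split_queues(dests, blocked_output, cycles):
    """Separate queues at the same input, so a blocked head does not stall others."""
    queues = {}
    for d in dests:
        queues.setdefault(d, deque()).append(d)
    sent = 0
    for _ in range(cycles):
        for d, q in queues.items():
            if q and d != blocked_output:
                q.popleft()
                sent += 1
                break  # the input port still sends at most one packet per cycle
    return sent


# The head packet wants output 0, which is blocked; two packets behind it want output 1.
fifo_sent = delivered_single_fifo([0, 1, 1], blocked_output=0, cycles=3)
split_sent = delivered_split_queues([0, 1, 1], blocked_output=0, cycles=3)
```

In this example the single FIFO delivers nothing in three cycles, while the split queues deliver both packets bound for the free output.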
Resource Allocation
• Policies for arbitration for crossbar, output link, etc.
  » Static priority
  » Random
  » Round-robin
  » Oldest-first
• Effects of adaptive routing?
  » Select output link based on availability
  » Requires feedback from output port
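Two of the listed policies can be sketched in a few lines. The class and function names are hypothetical, and requests are modeled as a list of booleans (round-robin) or a dictionary of requester ages (oldest-first); a hardware arbiter would implement the same idea in logic.

```python
class RoundRobinArbiter:
    """Grants one requester per call, rotating priority after each grant."""

    def __init__(self, n):
        self.n = n
        self.next = 0  # index with highest priority on the next call

    def grant(self, requests):
        """requests: list of n booleans. Returns the granted index or None."""
        for i in range(self.n):
            idx = (self.next + i) % self.n
            if requests[idx]:
                self.next = (idx + 1) % self.n  # winner becomes lowest priority
                return idx
        return None


def oldest_first(ages):
    """ages: {requester: age}. Grants the requester that has waited longest."""
    return max(ages, key=ages.get) if ages else None


arb = RoundRobinArbiter(3)
grants = [arb.grant([True, True, False]) for _ in range(3)]
oldest = oldest_first({"in0": 3, "in1": 7})
```

With inputs 0 and 1 both requesting continuously, the round-robin arbiter alternates between them, whereas a static-priority arbiter would starve input 1. Oldest-first needs an age carried with each request, which costs header bits or extra state.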
For Future Reading
• We have covered only the tip of the iceberg, and we have hidden most of the complexity
• Issues we've brushed under the rug:
  » Physical design of buffers, arbitration logic, etc.
  » Non-crossbar implementations (e.g., using a bus)
  » Control logic for managing switch
  » Using speculation to reduce switch pipeline depth
  » How flow control fits into the switch design
  » Etc.
• I refer you to the textbook by Dally and Towles for a more comprehensive treatment of this subject
Case Study: Cray T3D
• 1024 switch nodes, each connected to 2 processors
• 3D torus, bidirectional, 300 MB/s
• Link: 16 bits, 8 control bits
• Variable-size packet (multiple of 16 bits)
• Logical request & response networks
  » 2 virtual channels each for deadlock avoidance
• Stacked dimension routing
• Wormhole for large packets, virtual cut-through for small packets
Real (But Old) Machines
PRESENTATION: Alpha 21364 (EV7) Network
PRESENTATION: Flattened Butterfly