Phastlane: A Rapid Transit Optical Routing Network

1 Phastlane: A Rapid Transit Optical Routing Network
Mark Cianchetti, Joseph Kerekes, and David Albonesi
Computer Systems Laboratory, Cornell University

2 The Interconnect Bottleneck
Future processors: tens to hundreds of cores
Dire need for fast and power-efficient on-chip interconnect
Nanophotonics
  Ultra-fast signal propagation
  High bandwidth density
  Low power
[Figure: photonic system components; labels: electrical input data (V input), E/O translation, optical coupler, O/E translation]

3 Limitations of Nanophotonics
Lack of fundamental building blocks
  Logic gates
  Memory structures
Single routing layer
  Power loss of waveguide crossings
Traditional Approach To Exploiting Photonics
  Upon formation, unimpeded data blasted from source to destination
[Figure: direct bus path from source node to destination node]

4 Respecting the Limitations
Previous proposals largely bus-based
Direct photonic links between source and destination
Data blasted in a single optical transmission
[Figure: Cornell Ring, Columbia, Corona, and FireFly architectures; labels: electrical network, optical network, mesh, column, dest node]

5 Overview of Previous Work

Feature                | Cornell Ring | Corona               | Columbia         | Phastlane
Network Topology       | Ring-bus     | Snake XBAR           | Torus            | Mesh
Network Operation      | Snoopy Bus   | Fully Connected XBAR | Electrical Setup | Packet Switched
Shared Resources       | None         | Destination Bus      | Router Channels  | Router Channels
Shared Resource Arb.   | WDM          | Token                | Per Hop          | Per Hop
Unit of Data Transfer  | Cache Line   | Cache Line           | >> Cache Line    | Cache Line

6 Phastlane Contributions
Novel nanophotonic router architecture
Hybrid optical/electrical packet-switched mesh network
  Nanophotonics: high-speed, multi-hop packet transfers
  Electrical: logic and buffering to handle packet conflicts

7 Phastlane Architecture
Minimally impede flow of light from source to destination
Simplified routing, arbitration
Dual die configuration
[Figure: routing and arbitration for a transmit from Node 0 to Node 10]

8 Optical Node Operation
[Figure: optical node; labels: Crossbar, Resonators, Optical Router]

9 Electrical Node Operation
Blocked packets are buffered locally
Per-port buffers, single processor buffer
Output multiplexers connect to output port optical transmitters
[Figure: electrical node]

10 Simplified Packet Routing
Packets are dimension-order, source routed
Portion of packet holds pre-computed routing bits
Per-hop control bits enable near-instant switch traversal
[Figure: XY dimension-order route and packet layout; labels: per-hop packet control (hop number, direction encoding such as Local, Straight, Left), source routing, data/address/other]
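
The pre-computed routing bits described on this slide can be illustrated with a minimal sketch. It assumes row-major node numbering on the 8x8 mesh from the evaluation setup, a y axis that grows downward, and direction names relative to the packet's current heading; none of these conventions are stated on the slides, and the function names are hypothetical. The sketch lists one control entry per router reached after the source and does not model the source's own output-port selection.

```python
MESH_WIDTH = 8  # 8x8 mesh from the evaluation slide; row-major numbering is an assumption


def coords(node):
    """Map a node id to (x, y), assuming row-major numbering."""
    return node % MESH_WIDTH, node // MESH_WIDTH


def xy_source_route(src, dst):
    """Return one direction entry per router visited after the source.

    The route travels fully in X first, then in Y (dimension order).
    """
    sx, sy = coords(src)
    dx, dy = coords(dst)

    # Heading of every link the packet traverses, X links first.
    steps = [(1 if dx > sx else -1, 0)] * abs(dx - sx)
    steps += [(0, 1 if dy > sy else -1)] * abs(dy - sy)

    controls = []
    for heading, nxt in zip(steps, steps[1:]):
        if heading == nxt:
            controls.append("Straight")
        else:
            # With y growing downward, a positive cross product is a clockwise (right) turn.
            cross = heading[0] * nxt[1] - heading[1] * nxt[0]
            controls.append("Right" if cross > 0 else "Left")
    if steps:
        controls.append("Local")  # final router ejects the packet to its node
    return controls


print(xy_source_route(0, 10))
```

Under these assumptions, a transfer from Node 0 to Node 10 crosses three links, so the source would prepend three control entries, the last of which is Local.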

11 Per Hop Control Bits
Router R1 examines control bits
Per-hop control groups set switch resonators
  Left, right, straight, local, broadcast
Frequency translation
  Enables hop-oblivious routing
[Figure: control waveguide with hop numbers R6 R5 R4 R3 R2 R1 and direction encodings Local, Straight, Straight, Straight, Left, Straight; labels: frequency shift, empty]
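
Hop-oblivious routing, as described on this slide, can be sketched in software as follows. The model is an assumption: every router reads the control group in the same slot (slot 0 here), and frequency translation is represented as shifting the remaining groups down by one slot and leaving an empty group behind. The pairing of hop numbers to directions (R1 = Straight, R2 = Left, ..., R6 = Local) is read positionally from the figure labels, and the function names are hypothetical.

```python
EMPTY = "Empty"


def traverse_router(control_groups):
    """Model one router: act on slot 0, then shift the other groups down one slot."""
    action = control_groups[0] if control_groups else EMPTY
    shifted = list(control_groups[1:]) + [EMPTY]
    return action, shifted


def follow_route(control_groups):
    """Apply traverse_router hop by hop until the packet is ejected or runs out of groups."""
    actions = []
    groups = list(control_groups)
    while True:
        action, groups = traverse_router(groups)
        actions.append(action)
        if action in ("Local", EMPTY):
            return actions


# Control groups for routers R1..R6, as read from the slide's figure labels.
route = ["Straight", "Left", "Straight", "Straight", "Straight", "Local"]
print(follow_route(route))
```

Because each router only ever inspects slot 0, no router needs to know how many hops the packet has already taken.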

12 Per Hop Control Bits Continued
Control waveguide split into C0 and C1
  WDM limitations
Frequency and physical translation
  Enables hop-oblivious routing
[Figure: C0 and C1 control waveguides]

13 Competing Packets: Resolution
Packet collisions avoided by packet blocking
Collision resolution performed on the fly
[Figure: two packets collide at a router with N, W, E, S ports; labels: collision resolution, packet received]

14 High-Speed Drop Network
Drop path gradually formed as packet passes through network
Packet dropped if downstream buffer full
Drop signal travels opposite data packet
[Figure: drop network across routers with N, E, W ports; waveguide labels C0, C1 and D0, D1, D2 with input (i) and output (o) suffixes; source and drop node marked]
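
The blocking, buffering, and drop behavior from the preceding slides can be summarized in a small behavioral sketch. It is not the authors' hardware design: the `Router` class, its fields, and the buffer capacity are hypothetical, and the optical drop network is reduced to a list of router names in the order a drop signal would travel back toward the source.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    name: str
    output_busy: bool = False   # True when a competing packet holds the output port
    buffer_capacity: int = 4    # hypothetical value, not taken from the slides
    buffered: list = field(default_factory=list)


def send(packet, path):
    """Walk a packet along a non-empty list of routers.

    Returns (outcome, router_name, drop_signal_path).
    """
    if not path:
        raise ValueError("path must contain at least one router")
    traversed = []  # the return path forms gradually as the packet advances
    for router in path:
        traversed.append(router.name)
        if not router.output_busy:
            continue  # unblocked: the packet passes straight through
        if len(router.buffered) < router.buffer_capacity:
            router.buffered.append(packet)  # blocked packets are buffered locally
            return ("buffered", router.name, [])
        # Buffer full: drop the packet and signal back along the reverse path.
        return ("dropped", router.name, list(reversed(traversed)))
    return ("delivered", path[-1].name, [])


r0, r1, r2 = Router("R0"), Router("R1", output_busy=True, buffer_capacity=1), Router("R2")
print(send("p1", [r0, r1, r2]))
print(send("p2", [r0, r1, r2]))
```

In this sketch the first packet fills the one-entry buffer at R1, so the second packet is dropped there and the drop signal path runs from R1 back to R0.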

15 Router Design Space Exploration

16 Evaluation Methodology
64 node, 8x8 mesh network topology
Low latency, high bandwidth electrical network baseline
Multi-port receive for destination packets
Packet size: 80 bytes, single flit
Synthetic and Splash benchmarks
[Figure: baseline configuration; labels: XBAR, baseline node]

17 Synthetic Benchmark Results
[Figure: four panels (Bit Comp, Bit Rev, Shuffle, Transpose), each plotting Latency against Offered Traffic; the Bit Comp panel is annotated 8X and the Bit Rev panel 6X]
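
The slides name the four synthetic traffic patterns but do not define them. The sketch below uses the definitions commonly given in the interconnection-network literature for a 64-node network with 6-bit node addresses; whether the authors used exactly these variants is an assumption, and the function names are illustrative.

```python
NUM_NODES = 64          # 8x8 mesh from the evaluation slide
BITS = 6                # log2(NUM_NODES) address bits
MASK = NUM_NODES - 1


def bit_complement(src):
    """Destination is the bitwise complement of the source address."""
    return ~src & MASK


def bit_reverse(src):
    """Destination is the source address with its bit order reversed."""
    return int(format(src, f"0{BITS}b")[::-1], 2)


def shuffle(src):
    """Destination is the source address rotated left by one bit."""
    return ((src << 1) | (src >> (BITS - 1))) & MASK


def transpose(src):
    """Destination swaps the upper and lower halves of the address bits."""
    half = BITS // 2
    return ((src << half) | (src >> half)) & MASK


for pattern in (bit_complement, bit_reverse, shuffle, transpose):
    print(pattern.__name__, pattern(10))
```

Each pattern is a fixed permutation, so every source always sends to the same destination, which is what makes these patterns stress specific links of a dimension-order-routed mesh.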

18 Splash Performance Analysis
2X speedup across all benchmarks
[Figure: network speedup per benchmark (Barnes, Cholesky, FFT, Lu, Ocean, Radix, Raytrace, WaterNSq, Waterspatial, FMM); series: Electrical, Optical, Optical 64 Entry Buffers, Optical Inf Buffers]

19 Splash Power Analysis
80% reduction in power across all benchmarks
[Figure: relative power per benchmark (Barnes, Cholesky, FFT, Lu, Ocean, Radix, Raytrace, WaterNSq, Waterspatial, FMM); series: Electrical, Optical, Optical 64 Entry Buffers]

20 Conclusions
Novel nanophotonic router architecture
Packet-switched, hybrid optical/electrical mesh network
Up to 8X performance improvement for synthetic workloads
Up to 4X performance improvement, 80% power reduction, for Splash
Future work
  Lower power broadcast scheme
  Improved allocation and flow control
