3. Time Switching, Multi-Queue Memories, Shared Buffers, Output Queueing Family

3.1 TDM, Time Switching, Cut-Through
3.2 Wide Memories for High Throughput, Segmentation Overhead
3.3 Multiple Queues within a Buffer Memory
3.4 Queueing for Multicast Traffic
3.5 Shared Buffering and the Output Queueing Family

Manolis Katevenis, CS-534, Univ. of Crete and FORTH, Greece

3.2 Wide Memories for High Throughput

Table of Contents:
  3.2.1 Wide Memories for High Throughput
    - Shared Buffer Switch using a Wide Memory
    - Pipelined Memory: Optimized Shared Buffer Switching
    - Generalization: Interleaved Memory Banks
  3.2.2 Variable Size Packet Segmentation Overhead
    - Number of memory accesses or crossbar cell-times versus packet-time on the line

3.2 - U. Crete - M. Katevenis - CS-534 - Wide Memories, Segmentation Overhead

3.2.1 Shared Buffer Switch using a Wide Memory

[Figure: N input ports and N output ports, each side with aggregate capacity C = N, sharing one buffer implemented as a wide (56 bits) memory.]

Example: [...] Gbps/link = [...]0 MHz x [...] bits
- The memory serves each I/O register for one clock cycle out of every eight.
- Note the hidden input and output crossbars; note the required cell-time alignment.

Wide Memory: Double Buffering Needed

[Figure: inputs in0, in1 and outputs out0, out1 around the wide-memory buffer, with separate cut-through paths leading from the inputs to a cut-through crossbar in front of the outputs.]

- Inputs need double buffering for the case when multiple cells complete at the same time.
- If cut-through is desired, separate paths and a crossbar are needed.
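The time-sharing arithmetic behind the wide memory can be sketched as follows. This is a minimal illustration, not the slide's own example (whose numbers did not survive): it assumes a switch with `n_ports` inputs and `n_ports` outputs, one I/O register per port, and a memory that serves one register per clock cycle, so each register gets one cycle out of every `2 * n_ports`. The function names and the sample values (4 ports, 256 bits, 100 MHz) are hypothetical.

```python
def per_link_throughput_bps(width_bits, clock_hz, n_ports):
    """Throughput each link can sustain when the wide memory is time-shared.

    Assumption: one register per input plus one per output, each served
    for exactly one memory cycle out of every 2 * n_ports.
    """
    slots = 2 * n_ports
    return width_bits * clock_hz // slots


def required_width_bits(link_bps, clock_hz, n_ports):
    """Smallest memory width (bits) that keeps up with the given link rate."""
    slots = 2 * n_ports
    needed = link_bps * slots
    return -(-needed // clock_hz)  # integer ceiling division


# Hypothetical 4x4 switch: eight I/O registers share the memory.
rate = per_link_throughput_bps(width_bits=256, clock_hz=100_000_000, n_ports=4)
width = required_width_bits(link_bps=rate, clock_hz=100_000_000, n_ports=4)
print(rate, width)
```

With four ports per side there are eight registers, which matches the "one clock cycle out of every eight" remark; whether the original example used these exact figures is not known.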

3.2.1 Optimized Implementation: Pipelined Memory

- The monolithic wide memory is replaced by multiple banks.
- Concurrent read/write access to the entire width is replaced by a wave of same-address accesses moving from left to right.

[Figure: inputs in0, in1 feed memory banks M0, M1, M2, M3 under a common Address & Control path; an output latch drives outputs out0, out1.]

Benefit 1: no double buffering needed at the inputs; a single latch at the outputs.
Benefit 2: cut-through occurs automatically; no extra paths and no extra control are needed.

Pipelined Memory Operation Illustrated

[Figure: pipelined memory operation.]

Ref: Katevenis, Vatsolaki, Efthymiou: "Pipelined Memory Shared Buffer for VLSI Switches", ACM SIGCOMM Conf. 1995.
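The "wave of same-address accesses" can be illustrated with a toy cycle-by-cycle model. This is a sketch under stated assumptions, not the design from the cited paper: an operation that enters bank 0 in cycle t touches bank k in cycle t + k, at most one operation starts per cycle, and a cell is one word per bank. The function name and the operation tuple layout are hypothetical.

```python
def simulate_pipelined_memory(num_banks, ops):
    """Run operations through a toy pipelined memory.

    ops: non-empty list of (start_cycle, kind, address, words), where kind
    is "write" or "read". For a write, words holds one word per bank; for
    a read it is ignored.
    Returns {(start_cycle, address): [words read]} for every read.
    """
    starts = [op[0] for op in ops]
    # One new operation enters bank 0 per cycle, so no bank is ever asked
    # to perform two accesses in the same cycle.
    assert len(starts) == len(set(starts)), "one operation per start cycle"

    banks = [dict() for _ in range(num_banks)]  # bank k: address -> word
    reads = {}
    last_cycle = max(starts) + num_banks
    for cycle in range(last_cycle):
        for start, kind, address, words in ops:
            k = cycle - start  # bank reached by this operation in this cycle
            if not 0 <= k < num_banks:
                continue
            if kind == "write":
                banks[k][address] = words[k]
            else:
                reads.setdefault((start, address), []).append(
                    banks[k].get(address)
                )
    return reads


cell = ["w0", "w1", "w2", "w3"]
# The read of address 5 starts just one cycle after the write of address 5.
result = simulate_pipelined_memory(4, [(0, "write", 5, cell), (1, "read", 5, None)])
print(result)
```

In this model the read trails the write by one bank, so it always finds the word it needs already stored. That is one way to read the claim that cut-through needs no dedicated paths or control.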

3.2.1 Generalization: Interleaved Memory Banks

[Figure: an input crossbar and an output crossbar on either side of memory banks bank 0 ... bank N-1, each bank acting as an effective crossbar port; the figure contrasts a multi-bank packet layout with a single-bank packet layout.]

- Can we scale wide-memory throughput arbitrarily by increasing its width?
- Not past the width of an individual packet; beyond that it becomes something else (see below).

Wide memory width versus packet size

- A wide (or pipelined) memory owes its simplicity to the controller generating a single address per cycle for all banks (the entire width).
- To have multiple packets per line (e.g. red and blue in the figure), independent address controllers are needed per bank (per effective crossbar port).
- Note: it is infeasible to pack together into the same line packets that both arrive and depart consecutively in time.
- Scheduling interleaved memory accesses against bank conflicts is equivalent to crossbar scheduling under input queueing.

Scalable Shared-Buffer Switches through Interleaved Banks:

- Complexity: maintain distributed queues, forward cells in-order.
- The Parallel Packet Switch (PPS): Khotimsky, Krishnan: IEEE Int. Conf. Commun. (ICC) 2001; Iyer, McKeown: IEEE/ACM ToN, Apr. 2003.
- Birkhoff-von Neumann: C. Chang, W. Chen, H. Huang: Infocom.
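The equivalence between bank-conflict scheduling and crossbar scheduling can be made concrete with a small sketch. Each port holds pending accesses to particular banks (much like virtual output queues), and in every cycle a conflict-free set must be chosen in which each port and each bank is used at most once. The code below uses a plain greedy maximal matching chosen only for brevity; it is not an algorithm named in the slides, and the function name and data layout are hypothetical.

```python
def schedule_cycle(pending):
    """Pick a conflict-free set of bank accesses for one cycle.

    pending: dict mapping port -> set of banks for which that port has an
             access waiting.
    Returns: dict mapping port -> granted bank, with every port and every
             bank appearing at most once (a matching in the bipartite
             port/bank request graph).
    """
    used_banks = set()
    grants = {}
    for port in sorted(pending):            # fixed priority order (assumption)
        for bank in sorted(pending[port]):
            if bank not in used_banks:
                grants[port] = bank
                used_banks.add(bank)
                break                        # one access per port per cycle
    return grants


requests = {0: {1}, 1: {1, 2}, 2: {2}}
print(schedule_cycle(requests))
```

Here port 2 loses to port 1 over bank 2 and must wait for a later cycle, the same kind of contention that an input-queued crossbar scheduler has to resolve. A real scheduler would also rotate priorities for fairness, which this sketch omits.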

3.2.2 Variable Size Packet Segmentation Overhead

- Most buffer memories, queueing structures, and crossbars operate on fixed-size segments, blocks, or cells.
- Most applications use variable-size packets.
- Packets are segmented into blocks/cells upon entry from the line to the switch.
- The throughput of buffer memories and crossbars, measured in terms of cells, must be higher than the line throughput, to compensate for the segmentation overhead.

[Figure: time cost of a packet on the line versus in the buffer or crossbar. On the line, each packet is preceded by line overhead that is not written to memory (e.g. interpacket gap, CRC, preamble, ...). In the buffer or crossbar, the packet occupies a whole number of units of the buffer width (segment size, cell size); the unused remainder of the last unit is buffer overhead.]

Overspeed required to compensate for Segmentation

Example: min. packet size = 40 B; segment (cell) = 64 B; line overhead = 6 B

Rules of thumb:
- For line overhead = 0, the overspeed is often 2.0, but check the minimum packet size: if the minimum packet is too small (smaller than about half a cell), then a higher overspeed is required.
- For line overhead > 0, the overspeed is < 2.0, but check minimum packets.
- For very large line overhead, check larger packet sizes.

[Figure: Buffer/Crossbar Cost (Bytes) versus Line Cost (Bytes). The buffer cost is a staircase that steps up by one cell (one cell, second cell, third cell) as the packet grows from the minimum packet size; the line cost is offset by the line overhead. Annotation: overspeed = (2 cells) / (1 cell + line overhead) = 8/[...]]
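The rules of thumb can be checked numerically with a short sketch. It assumes the buffer or crossbar cost of a packet is its length rounded up to a whole number of cells, and its line cost is its length plus the line overhead; the overspeed is the worst ratio of the two over all packet sizes. The 40 B minimum packet and 64 B cell come from the example above. The 1500 B maximum packet and the 16 B line overhead in the second call are assumed values for illustration only, and the function names are hypothetical.

```python
def overspeed(packet_bytes, cell_bytes, line_overhead_bytes):
    """Buffer/crossbar cost divided by line cost for one packet size."""
    cells = -(-packet_bytes // cell_bytes)          # ceiling division
    buffer_cost = cells * cell_bytes                # whole cells only
    line_cost = packet_bytes + line_overhead_bytes  # overhead is not stored
    return buffer_cost / line_cost


def worst_case_overspeed(min_packet, max_packet, cell_bytes, line_overhead_bytes):
    """Return (largest overspeed, packet size where it occurs)."""
    return max(
        (overspeed(p, cell_bytes, line_overhead_bytes), p)
        for p in range(min_packet, max_packet + 1)
    )


# No line overhead: the worst case is a packet one byte longer than a cell.
print(worst_case_overspeed(40, 1500, 64, 0))
# Assumed 16 B line overhead: the worst case drops below 2.0.
print(worst_case_overspeed(40, 1500, 64, 16))
# Minimum packet under half a cell: the minimum packet becomes the worst case.
print(worst_case_overspeed(20, 1500, 64, 0))
```

With zero overhead the worst case is the 65 B packet, which needs two cells (128/65, just under 2.0). Lowering the minimum packet to 20 B moves the worst case to the minimum packet itself (64/20 = 3.2), which is why the slide says to check the minimum packet size.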

4. Time-Space Switching and the family of Input Queueing Architectures

4.1 Intro: Time-Space Sw.; Input Q'ing w/o VOQ's 4.2 Input Queueing with VOQ's: Crossbar Scheduling 4.3 Adv.Topics: Pipelined, Packet,