3. Time Switching, Multi-Queue Memories, Shared Buffers, Output Queueing Family

Size: px
Start display at page:

Download "3. Time Switching, Multi-Queue Memories, Shared Buffers, Output Queueing Family"

Transcription

1 3. Time Sitching, Multi-Queue Memories, Shared Buffers, Output Queueing Family 3. TDM, Time Sitching, Cut-Through 3. Wide Memories for High Thruput, Segm tn Ovrhd 3.3 Multiple Queues ithin a Buffer Memory 3.4 Queueing for Multicast Traffic 3.5 Shared Buffering and the Output Q ing Family Manolis Katevenis CS-534 Univ. of Crete and FORTH, Greece 3. Wide Memories for High Throughput Table of Contents: 3.. Wide Memories for High Throughput Shared Buffer Sitch using a Wide Memory Pipelined Memory: Optimized Shared Buffer Sitching Generalization: Interleaved Memory Banks 3.. Variable Size Packet Segmentation Overhead Number of memory accesses or crossbar cell-times versus packet-time on the line 3. - U. Crete - M. Katevenis - CS Wide Memories, Segm'tn Ovrhd

2 3.. Shared Buffer Sitch using a Wide Memory N C=N Shared Buffer C=N N Input Ports Output Ports Wide (56 bits) Memory Shared Buffer Example: Gbps/link = 0 MHz bits Memory serves each I/O register for one clock cycle out of every eight Note hidden input and output crossbars; note required cell time alignment 3. - U. Crete - M. Katevenis - CS Wide Memory: Double Buffering Needed in0 cut-through paths in Input need double buffering for the case hen multiple cells complete at the same time If cut-through is desired, separate paths and a crossbar are needed input output 4 Wide-Memory Buffer cut-through crossbar out0 out 3. - U. Crete - M. Katevenis - CS Wide Memories, Segm'tn Ovrhd

3 3.. Optimized Implementation: Pipelined Memory Monolithic ide memory replaced by multiple banks concurrent rd/r access to entire idth replaced by a ave of same-address accesses moving from left to right in0 in output latch Address & Control M0 M M M3 Benefit : no double buffering needed at inputs, single latch at outputs Benefit : cut-through occurs automatically no paths, no control needed input out0 out 3. - U. Crete - M. Katevenis - CS Pipelined Memory Operation Illustrated Ref: Katevenis, Vatsolaki, Efthymiou: Pipelined Memory Shared Buffer for VLSI Sitches, ACM SIGCOMM Conf. 995, U. Crete - M. Katevenis - CS Wide Memories, Segm'tn Ovrhd 3

4 3.. Generalization: Interleaved Memory Banks Effective Xbar Port M Effective Xbar Port S Output Crossbar Input Crossbar multi-bank packet layout bank 0 bank bank bank 3 bank 4 bank 5 bank 6 bank bk N- single bank pck layout bk N- Can e scale arbitrarily ide-memory throughput by increasing its idth? Not past individual packet idth then it becomes something else; see 3. - U. Crete - M. Katevenis - CS Interleaved Mem. Sh.Buf. PPS, Birkhoff-vonNeumann Wide memory idth versus packet size Wide (or pipelined) memory oes its simplicity to the controller generating a single address per cycle for all banks (entire idth) To have multiple packets per line (e.g. red & blue in figure): need independent address controllers per bank (per effective xbar port) Note: infeasible to pack together into the same line packets that both arrive and depart consecutively in time Scheduling interleaved memory accesses against bank conflicts is equivalent to crossbar scheduling under input queueing Scalable Shared-Buffer Sitches through Interleaved Banks: Complexity: maintain distributed queues, forard cells in-order The Parallel Packet Sitch (PPS) Khotimsky, Krishnan: IEEE Int. Conf. Commun. (ICC) 00; Iyer, McKeon: IEEE/ACM ToN, Apr. 003 Birkhoff-vonNeumann C. Chang, W. Chen, H. Huang: Infocom U. Crete - M. Katevenis - CS Wide Memories, Segm'tn Ovrhd 4

5 3.. Variable Size Packet Segmentation Overhead most buffer memories, queueing structures, & crossbars operate on fixed-size segments, blocks, or cells most appl ns use variable size packets prev. pck. line overhead interpacket gap not ritten to memory (e.g. CRC, preamble,...) cost (duration) on-the-line this packet buf.ovrhd buffer idth or segment size or cell size (time) next packet time cost in the buffer or in the crossbar Packets segmented into blocks/cells upon entry from line to sitch Throughput of buffer memories & crossbars, measured in terms of cells, must be higher than line thruput, to compensate for segment n overhead 3. - U. Crete - M. Katevenis - CS Overspeed req d to compensate for Segmentation Example: min. pck. size = 40 B segment (cell) = 64 B line overhead = 6 B Rules of thumb: For line overhead = 0 overspeed = often.0 but check min.pck.sz.: if minimum packet is too small (smaller than about half a cell), then higher overspeed is required for line overhead > 0 overspeed <.0, but check minimum packets for very large line ovrhd check larger pck sizes Buffer/Crossbar Cost (Bytes) one cell second cell third cell minimum pck size slope = slope = overspeed = ( cl) / ( cl + ln ovrhd) = 8/ one cell second cell line overhead Line Cost (Bytes) 3. - U. Crete - M. Katevenis - CS Wide Memories, Segm'tn Ovrhd 5

4. Time-Space Switching and the family of Input Queueing Architectures

4. Time-Space Switching and the family of Input Queueing Architectures 4. Time-Space Switching and the family of Input Queueing Architectures 4.1 Intro: Time-Space Sw.; Input Q ing w/o VOQ s 4.2 Input Queueing with VOQ s: Crossbar Scheduling 4.3 Adv.Topics: Pipelined, Packet,

More information

CS-534 Packet Switch Architecture

CS-534 Packet Switch Architecture CS-534 Packet Switch Architecture The Hardware Architect s Perspective on High-Speed Networking and Interconnects Manolis Katevenis University of Crete and FORTH, Greece http://archvlsi.ics.forth.gr/~kateveni/534

More information

2. Link and Memory Architectures and Technologies

2. Link and Memory Architectures and Technologies 2. Link and Memory Architectures and Technologies 2.1 Links, Thruput/Buffering, Multi-Access Ovrhds 2.2 Memories: On-chip / Off-chip SRAM, DRAM 2.A Appendix: Elastic Buffers for Cross-Clock Commun. Manolis

More information

CS-534 Packet Switch Architecture

CS-534 Packet Switch Architecture CS-534 Packet Switch rchitecture The Hardware rchitect s Perspective on High-Speed etworking and Interconnects Manolis Katevenis University of Crete and FORTH, Greece http://archvlsi.ics.forth.gr/~kateveni/534.

More information

2. Link and Memory Architectures and Technologies

2. Link and Memory Architectures and Technologies 2. Link and Memory Architectures and Technologies 2.1 Links, Thruput/Buffering, Multi-Access Ovrhds 2.2 Memories: On-chip / Off-chip SRAM, DRAM 2.A Appendix: Elastic Buffers for Cross-Clo Commun. Manolis

More information

6.2 Per-Flow Queueing and Flow Control

6.2 Per-Flow Queueing and Flow Control 6.2 Per-Flow Queueing and Flow Control Indiscriminate flow control causes local congestion (output contention) to adversely affect other, unrelated flows indiscriminate flow control causes phenomena analogous

More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics 7. Output Scheduling for QoS Guarantees 8.

More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics 7. Output Scheduling for QoS Guarantees 8.

More information

Design and Implementation of a Shared Memory Switch Fabric

Design and Implementation of a Shared Memory Switch Fabric 6'th International Symposium on Telecommunications (IST'2012) Design and Implementation of a Shared Memory Switch Fabric Mina Ejlali M.ejlalikamorin@ec.iut.ac.ir Mohammad Ali Montazeri Montazeri@cc.iut.ac.ir

More information

2. Link and Memory Architectures and Technologies

2. Link and Memory Architectures and Technologies 2. Link and Memory Architectures and Technologies 2.1 Links, Thruput/Buffering, Multi-Access Ovrhds 2.2 Memories: On-chip / Off-chip SRAM, DRAM 2.A Appendix: Elastic Buffers for Cross-Clock Commun. Manolis

More information

Parallel Packet Switching using Multiplexors with Virtual Input Queues

Parallel Packet Switching using Multiplexors with Virtual Input Queues Parallel Packet Switching using Multiplexors with Virtual Input Queues Ahmed Aslam Kenneth J. Christensen Department of Computer Science and Engineering University of South Florida Tampa, Florida 33620

More information

CS698Y: Modern Memory Systems Lecture-16 (DRAM Timing Constraints) Biswabandan Panda

CS698Y: Modern Memory Systems Lecture-16 (DRAM Timing Constraints) Biswabandan Panda CS698Y: Modern Memory Systems Lecture-16 (DRAM Timing Constraints) Biswabandan Panda biswap@cse.iitk.ac.in https://www.cse.iitk.ac.in/users/biswap/cs698y.html Row decoder Accessing a Row Access Address

More information

Variable-Size Multipacket Segments in Buffered Crossbar (CICQ) Architectures

Variable-Size Multipacket Segments in Buffered Crossbar (CICQ) Architectures Variable-Size Multipacket Segments in Buffered Crossbar (CICQ) Architectures Manolis Katevenis and Georgios Passas Institute of Computer Science (ICS), Foundation for Research and Technology - Hellas (FORTH)

More information

A Single Chip Shared Memory Switch with Twelve 10Gb Ethernet Ports

A Single Chip Shared Memory Switch with Twelve 10Gb Ethernet Ports A Single Chip Shared Memory Switch with Twelve 10Gb Ethernet Ports Takeshi Shimizu, Yukihiro Nakagawa, Sridhar Pathi, Yasushi Umezawa, Takashi Miyoshi, Yoichi Koyanagi, Takeshi Horie, Akira Hattori Hot

More information

Parallelism in Network Systems

Parallelism in Network Systems High Performance Switching Telecom Center Workshop: and outing Sept 4, 997. Parallelism in Network Systems Joint work with Sundar Iyer HP Labs, 0 th September, 00 Nick McKeown Professor of Electrical Engineering

More information

Vector an ordered series of scalar quantities a one-dimensional array. Vector Quantity Data Data Data Data Data Data Data Data

Vector an ordered series of scalar quantities a one-dimensional array. Vector Quantity Data Data Data Data Data Data Data Data Vector Processors A vector processor is a pipelined processor with special instructions designed to keep the (floating point) execution unit pipeline(s) full. These special instructions are vector instructions.

More information

OpenFlow based Flow Level Bandwidth Provisioning for CICQ Switches

OpenFlow based Flow Level Bandwidth Provisioning for CICQ Switches OpenFlow based Flow Level Bandwidth Provisioning for CICQ Switches Hao Jin, Deng Pan, Jason Liu, and Niki Pissinou Florida International University Abstract Flow level bandwidth provisioning offers fine

More information

Snoop-Based Multiprocessor Design III: Case Studies

Snoop-Based Multiprocessor Design III: Case Studies Snoop-Based Multiprocessor Design III: Case Studies Todd C. Mowry CS 41 March, Case Studies of Bus-based Machines SGI Challenge, with Powerpath SUN Enterprise, with Gigaplane Take very different positions

More information

Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors

Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors University of Crete School of Sciences & Engineering Computer Science Department Master Thesis by Michael Papamichael Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors

More information

Packet Switch Architectures Part 2

Packet Switch Architectures Part 2 Packet Switch Architectures Part Adopted from: Sigcomm 99 Tutorial, by Nick McKeown and Balaji Prabhakar, Stanford University Slides used with permission from authors. 999-000. All rights reserved by authors.

More information

Efficient Queuing Architecture for a Buffered Crossbar Switch

Efficient Queuing Architecture for a Buffered Crossbar Switch Proceedings of the 11th WSEAS International Conference on COMMUNICATIONS, Agios Nikolaos, Crete Island, Greece, July 26-28, 2007 95 Efficient Queuing Architecture for a Buffered Crossbar Switch MICHAEL

More information

ECE 697J Advanced Topics in Computer Networks

ECE 697J Advanced Topics in Computer Networks ECE 697J Advanced Topics in Computer Networks Switching Fabrics 10/02/03 Tilman Wolf 1 Router Data Path Last class: Single CPU is not fast enough for processing packets Multiple advanced processors in

More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics Manolis Katevenis, Nikolaos Chrysos FORTH

More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics Manolis Katevenis, Nikolaos Chrysos FORTH

More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics Manolis Katevenis, Nikolaos Chrysos FORTH

More information

CONGESTION CONTROL BY USING A BUFFERED OMEGA NETWORK

CONGESTION CONTROL BY USING A BUFFERED OMEGA NETWORK IADIS International Conference on Applied Computing CONGESTION CONTROL BY USING A BUFFERED OMEGA NETWORK Ahmad.H. ALqerem Dept. of Comp. Science ZPU Zarka Private University Zarka Jordan ABSTRACT Omega

More information

Cell Switching (ATM) Commonly transmitted over SONET other physical layers possible. Variable vs Fixed-Length Packets

Cell Switching (ATM) Commonly transmitted over SONET other physical layers possible. Variable vs Fixed-Length Packets Cell Switching (ATM) Connection-oriented packet-switched network Used in both WAN and LAN settings Signaling (connection setup) Protocol: Q2931 Specified by ATM forum Packets are called cells 5-byte header

More information

The Network Layer and Routers

The Network Layer and Routers The Network Layer and Routers Daniel Zappala CS 460 Computer Networking Brigham Young University 2/18 Network Layer deliver packets from sending host to receiving host must be on every host, router in

More information

Packet Switching. Hongwei Zhang Nature seems to reach her ends by long circuitous routes.

Packet Switching. Hongwei Zhang  Nature seems to reach her ends by long circuitous routes. Problem: not all networks are directly connected Limitations of directly connected networks: limit on the number of hosts supportable limit on the geographic span of the network Packet Switching Hongwei

More information

COMP Parallel Computing. SMM (1) Memory Hierarchies and Shared Memory

COMP Parallel Computing. SMM (1) Memory Hierarchies and Shared Memory COMP 633 - Parallel Computing Lecture 6 September 6, 2018 SMM (1) Memory Hierarchies and Shared Memory 1 Topics Memory systems organization caches and the memory hierarchy influence of the memory hierarchy

More information

Designing Efficient Benes and Banyan Based Input-Buffered ATM Switches

Designing Efficient Benes and Banyan Based Input-Buffered ATM Switches Designing Efficient Benes and Banyan Based Input-Buffered ATM Switches Rajendra V. Boppana Computer Science Division The Univ. of Texas at San Antonio San Antonio, TX 829- boppana@cs.utsa.edu C. S. Raghavendra

More information

L2 cache provides additional on-chip caching space. L2 cache captures misses from L1 cache. Summary

L2 cache provides additional on-chip caching space. L2 cache captures misses from L1 cache. Summary HY425 Lecture 13: Improving Cache Performance Dimitrios S. Nikolopoulos University of Crete and FORTH-ICS November 25, 2011 Dimitrios S. Nikolopoulos HY425 Lecture 13: Improving Cache Performance 1 / 40

More information

Lecture 3: Flow-Control

Lecture 3: Flow-Control High-Performance On-Chip Interconnects for Emerging SoCs http://tusharkrishna.ece.gatech.edu/teaching/nocs_acaces17/ ACACES Summer School 2017 Lecture 3: Flow-Control Tushar Krishna Assistant Professor

More information

New Architecture for Code-Shadowing Applications. Anil Gupta Technical Executive, Winbond Electronics

New Architecture for Code-Shadowing Applications. Anil Gupta Technical Executive, Winbond Electronics New Architecture for Code-Shadowing Applications Dual-buffer SPI-NAND Arch providing SPI-NOR compatibility Anil Gupta Technical Executive, Winbond Electronics Santa Clara, CA 1 Key features of New Architecture

More information

Design of a Weighted Fair Queueing Cell Scheduler for ATM Networks

Design of a Weighted Fair Queueing Cell Scheduler for ATM Networks Design of a Weighted Fair Queueing Cell Scheduler for ATM Networks Yuhua Chen Jonathan S. Turner Department of Electrical Engineering Department of Computer Science Washington University Washington University

More information

Lecture 7: Flow Control - I

Lecture 7: Flow Control - I ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 7: Flow Control - I Tushar Krishna Assistant Professor School of Electrical

More information

DDR SDRAM UDIMM MT8VDDT3264A 256MB MT8VDDT6464A 512MB For component data sheets, refer to Micron s Web site:

DDR SDRAM UDIMM MT8VDDT3264A 256MB MT8VDDT6464A 512MB For component data sheets, refer to Micron s Web site: DDR SDRAM UDIMM MT8VDDT3264A 256MB MT8VDDT6464A 512MB For component data sheets, refer to Micron s Web site: www.micron.com 256MB, 512MB (x64, SR) 184-Pin DDR SDRAM UDIMM Features Features 184-pin, unbuffered

More information

Parallel Programming Platforms

Parallel Programming Platforms arallel rogramming latforms Ananth Grama Computing Research Institute and Department of Computer Sciences, urdue University ayg@cspurdueedu http://wwwcspurdueedu/people/ayg Reference: Introduction to arallel

More information

Queue and Flow Control Architectures for Interconnection Switches

Queue and Flow Control Architectures for Interconnection Switches Queue and low Control rchitectures for Interconnection Switches. asic Concepts and Queueing rchitectures. igh-throughput Multi-Queue Memories. Crossbars, Scheduling, & Combination Queueing. low and Congestion

More information

ispgdx2 vs. ispgdx Architecture Comparison

ispgdx2 vs. ispgdx Architecture Comparison isp2 vs. isp July 2002 Technical Note TN1035 Introduction The isp2 is the second generation of Lattice s successful isp platform. Architecture enhancements improve flexibility and integration when implementing

More information

Curbing Aggregate Member Flow Burstiness to Bound End-to-End Delay in Networks of TDMA Crossbar Real-Time Switches

Curbing Aggregate Member Flow Burstiness to Bound End-to-End Delay in Networks of TDMA Crossbar Real-Time Switches Curbing Aggregate Member Flow Burstiness to Bound End-to-End Delay in Networks of TDMA Crossbar Real-Time Switches Qixin Wang *, Yufei Wang *, Rong Zheng,*, Xue Liu * Dept. of Computing, the Hong Kong

More information

Dynamic Buffer Organization Methods for Interconnection Network Switches Amit Kumar Gupta, Francois Labonte, Paul Wang Lee, Alex Solomatnikov

Dynamic Buffer Organization Methods for Interconnection Network Switches Amit Kumar Gupta, Francois Labonte, Paul Wang Lee, Alex Solomatnikov I Dynamic Buffer Organization Methods for Interconnection Network Switches Amit Kumar Gupta, Francois Labonte, Paul Wang Lee, Alex Solomatnikov I. INTRODUCTION nterconnection networks originated from the

More information

Lecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University 18 447 Lecture 26: Interconnects James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L26 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today get an overview of parallel

More information

Performance Optimization Part II: Locality, Communication, and Contention

Performance Optimization Part II: Locality, Communication, and Contention Lecture 7: Performance Optimization Part II: Locality, Communication, and Contention Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Tunes Beth Rowley Nobody s Fault but Mine

More information

SCI Basics. Hugo Kohmann

SCI Basics. Hugo Kohmann Hugo Kohmann Slide 1 What is SCI ½ The SCI standard ANSI/IEEE 1596-1992 defines a point-to-point interface and a set of packet protocols. ½ The SCI protocols use small packet sizes with a 16-byte header

More information

Chapter-5 Memory Hierarchy Design

Chapter-5 Memory Hierarchy Design Chapter-5 Memory Hierarchy Design Unlimited amount of fast memory - Economical solution is memory hierarchy - Locality - Cost performance Principle of locality - most programs do not access all code or

More information

DDR SDRAM SODIMM MT8VDDT1664H 128MB 1. MT8VDDT3264H 256MB 2 MT8VDDT6464H 512MB For component data sheets, refer to Micron s Web site:

DDR SDRAM SODIMM MT8VDDT1664H 128MB 1. MT8VDDT3264H 256MB 2 MT8VDDT6464H 512MB For component data sheets, refer to Micron s Web site: SODIMM MT8VDDT1664H 128MB 1 128MB, 256MB, 512MB (x64, SR) 200-Pin SODIMM Features MT8VDDT3264H 256MB 2 MT8VDDT6464H 512MB For component data sheets, refer to Micron s Web site: www.micron.com Features

More information

Design and Simulation of Router Using WWF Arbiter and Crossbar

Design and Simulation of Router Using WWF Arbiter and Crossbar Design and Simulation of Router Using WWF Arbiter and Crossbar M.Saravana Kumar, K.Rajasekar Electronics and Communication Engineering PSG College of Technology, Coimbatore, India Abstract - Packet scheduling

More information

FIRM: A Class of Distributed Scheduling Algorithms for High-speed ATM Switches with Multiple Input Queues

FIRM: A Class of Distributed Scheduling Algorithms for High-speed ATM Switches with Multiple Input Queues FIRM: A Class of Distributed Scheduling Algorithms for High-speed ATM Switches with Multiple Input Queues D.N. Serpanos and P.I. Antoniadis Department of Computer Science University of Crete Knossos Avenue

More information

Hardware Acceleration

Hardware Acceleration Hardware Acceleration Sungho Kang Yonsei University Outline Introduction Boeing TEGAS Yorktown Simulation Engine Logic Simulation Machine HAL ZYCAD AAP-1 Reconfigurable 2 Why Simulation Engine Speed up

More information

Multiprocessor Interconnection Networks

Multiprocessor Interconnection Networks Multiprocessor Interconnection Networks Todd C. Mowry CS 740 November 19, 1998 Topics Network design space Contention Active messages Networks Design Options: Topology Routing Direct vs. Indirect Physical

More information

Scheduling Modes. Merging multiple Xena Traffic Generator streams onto one physical port

Scheduling Modes. Merging multiple Xena Traffic Generator streams onto one physical port APPLICATION NOTE Scheduling Modes Merging multiple Xena Traffic Generator streams onto one physical port A Traffic Generator from Xena Networks provides very flexible options for generation of Ethernet

More information

Crossbar - example. Crossbar. Crossbar. Combination: Time-space switching. Simple space-division switch Crosspoints can be turned on or off

Crossbar - example. Crossbar. Crossbar. Combination: Time-space switching. Simple space-division switch Crosspoints can be turned on or off Crossbar Crossbar - example Simple space-division switch Crosspoints can be turned on or off i n p u t s sessions: (,) (,) (,) (,) outputs Crossbar Advantages: simple to implement simple control flexible

More information

Lecture: Memory Technology Innovations

Lecture: Memory Technology Innovations Lecture: Memory Technology Innovations Topics: memory schedulers, refresh, state-of-the-art and upcoming changes: buffer chips, 3D stacking, non-volatile cells, photonics Multiprocessor intro 1 Row Buffers

More information

Optimizing Memory Bandwidth of a Multi-Channel Packet Buffer

Optimizing Memory Bandwidth of a Multi-Channel Packet Buffer Optimizing Memory Bandwidth of a Multi-Channel Packet Buffer Sarang Dharmapurikar Sailesh Kumar John Lockwood Patrick Crowley Dept. of Computer Science and Engg. Washington University in St. Louis, USA.

More information

Mobile Communications Chapter 7: Wireless LANs

Mobile Communications Chapter 7: Wireless LANs Characteristics IEEE 802.11 PHY MAC Roaming IEEE 802.11a, b, g, e HIPERLAN Bluetooth Comparisons Prof. Dr.-Ing. Jochen Schiller, http://www.jochenschiller.de/ MC SS02 7.1 Comparison: infrastructure vs.

More information

Real Time NoC Based Pipelined Architectonics With Efficient TDM Schema

Real Time NoC Based Pipelined Architectonics With Efficient TDM Schema Real Time NoC Based Pipelined Architectonics With Efficient TDM Schema [1] Laila A, [2] Ajeesh R V [1] PG Student [VLSI & ES] [2] Assistant professor, Department of ECE, TKM Institute of Technology, Kollam

More information

A Probabilistic Approach for Achieving Fair Bandwidth Allocations in CSFQ

A Probabilistic Approach for Achieving Fair Bandwidth Allocations in CSFQ A Probabilistic Approach for Achieving Fair Bandwidth Allocations in Peng Wang David L. Mills Department of Electrical & Computer Engineering University of Delaware Newark, DE 976 pwangee@udel.edu; mills@eecis.udel.edu

More information

CS 557 Congestion and Complexity

CS 557 Congestion and Complexity CS 557 Congestion and Complexity Observations on the Dynamics of a Congestion Control Algorithm: The Effects of Two-Way Traffic Zhang, Shenker, and Clark, 1991 Spring 2013 The Story So Far. Transport layer:

More information

1 Introduction

1 Introduction Published in IET Communications Received on 17th September 2009 Revised on 3rd February 2010 ISSN 1751-8628 Multicast and quality of service provisioning in parallel shared memory switches B. Matthews

More information

data block 0, word 0 block 0, word 1 block 1, word 0 block 1, word 1 block 2, word 0 block 2, word 1 block 3, word 0 block 3, word 1 Word index cache

data block 0, word 0 block 0, word 1 block 1, word 0 block 1, word 1 block 2, word 0 block 2, word 1 block 3, word 0 block 3, word 1 Word index cache Taking advantage of spatial locality Use block size larger than one word Example: two words Block index tag () () Alternate representations Word index tag block, word block, word block, word block, word

More information

HWP2 Application level query routing HWP1 Each peer knows about every other beacon B1 B3

HWP2 Application level query routing HWP1 Each peer knows about every other beacon B1 B3 HWP2 Application level query routing HWP1 Each peer knows about every other beacon B2 B1 B3 B4 B5 B6 11-Feb-02 Computer Networks 1 HWP2 Query routing searchget(searchkey, hopcount) Rget(host, port, key)

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #8 2/7/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Outline From last class

More information

Interconnection Networks

Interconnection Networks Lecture 17: Interconnection Networks Parallel Computer Architecture and Programming A comment on web site comments It is okay to make a comment on a slide/topic that has already been commented on. In fact

More information

Options. Data Rate (MT/s) CL = 3 CL = 2.5 CL = 2-40B PC PC PC

Options. Data Rate (MT/s) CL = 3 CL = 2.5 CL = 2-40B PC PC PC DDR SDRAM UDIMM MT16VDDF6464A 512MB 1 MT16VDDF12864A 1GB 1 For component data sheets, refer to Micron s Web site: www.micron.com 512MB, 1GB (x64, DR) 184-Pin DDR SDRAM UDIMM Features Features 184-pin,

More information

Data Center Traffic and Measurements: SoNIC

Data Center Traffic and Measurements: SoNIC Center Traffic and Measurements: SoNIC Hakim Weatherspoon Assistant Professor, Dept of Computer Science CS 5413: High Performance Systems and ing November 12, 2014 Slides from USENIX symposium on ed Systems

More information

Caching Basics. Memory Hierarchies

Caching Basics. Memory Hierarchies Caching Basics CS448 1 Memory Hierarchies Takes advantage of locality of reference principle Most programs do not access all code and data uniformly, but repeat for certain data choices spatial nearby

More information

SigmaRAM Echo Clocks

SigmaRAM Echo Clocks SigmaRAM Echo s AN002 Introduction High speed, high throughput cell processing applications require fast access to data. As clock rates increase, the amount of time available to access and register data

More information

2 Network Basics. types of communication service. how communication services are implemented. network performance measures. switching.

2 Network Basics. types of communication service. how communication services are implemented. network performance measures. switching. 2 Network Basics types of communication service how communication services are implemented switching multiplexing network performance measures 1 2.1 Types of service in a layered network architecture connection-oriented:

More information

Ting Wu, Chi-Ying Tsui, Mounir Hamdi Hong Kong University of Science & Technology Hong Kong SAR, China

Ting Wu, Chi-Ying Tsui, Mounir Hamdi Hong Kong University of Science & Technology Hong Kong SAR, China CMOS Crossbar Ting Wu, Chi-Ying Tsui, Mounir Hamdi Hong Kong University of Science & Technology Hong Kong SAR, China OUTLINE Motivations Problems of Designing Large Crossbar Our Approach - Pipelined MUX

More information

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 14 EE141

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 14 EE141 EECS 151/251A Fall 2017 Digital Design and Integrated Circuits Instructor: John Wawrzynek and Nicholas Weaver Lecture 14 EE141 Outline Parallelism EE141 2 Parallelism Parallelism is the act of doing more

More information

Router Architectures

Router Architectures Router Architectures Venkat Padmanabhan Microsoft Research 13 April 2001 Venkat Padmanabhan 1 Outline Router architecture overview 50 Gbps multi-gigabit router (Partridge et al.) Technology trends Venkat

More information

HARDWARE IMPLEMENTATION OF PIPELINE BASED ROUTER DESIGN FOR ON- CHIP NETWORK

HARDWARE IMPLEMENTATION OF PIPELINE BASED ROUTER DESIGN FOR ON- CHIP NETWORK DOI: 10.21917/ijct.2012.0092 HARDWARE IMPLEMENTATION OF PIPELINE BASED ROUTER DESIGN FOR ON- CHIP NETWORK U. Saravanakumar 1, R. Rangarajan 2 and K. Rajasekar 3 1,3 Department of Electronics and Communication

More information

Dynamic Types, Concurrency, Type and effect system Section and Practice Problems Apr 24 27, 2018

Dynamic Types, Concurrency, Type and effect system Section and Practice Problems Apr 24 27, 2018 Harvard School of Engineering and Applied Sciences CS 152: Programming Languages Apr 24 27, 2018 1 Dynamic types and contracts (a) To make sure you understand the operational semantics of dynamic types

More information

Interconnect Technology and Computational Speed

Interconnect Technology and Computational Speed Interconnect Technology and Computational Speed From Chapter 1 of B. Wilkinson et al., PARAL- LEL PROGRAMMING. Techniques and Applications Using Networked Workstations and Parallel Computers, augmented

More information

DDR SDRAM UDIMM. Draft 9/ 9/ MT18VDDT6472A 512MB 1 MT18VDDT12872A 1GB For component data sheets, refer to Micron s Web site:

DDR SDRAM UDIMM. Draft 9/ 9/ MT18VDDT6472A 512MB 1 MT18VDDT12872A 1GB For component data sheets, refer to Micron s Web site: DDR SDRAM UDIMM MT18VDDT6472A 512MB 1 MT18VDDT12872A 1GB For component data sheets, refer to Micron s Web site: www.micron.com 512MB, 1GB (x72, ECC, DR) 184-Pin DDR SDRAM UDIMM Features Features 184-pin,

More information

Basic I/O Interface

Basic I/O Interface Basic I/O Interface - 8255 11 3 THE PROGRAMMABLE PERIPHERAL 82C55 programmable peripheral interface (PPI) is a popular, low-cost interface component found in many applications. The PPI has 24 pins for

More information

DDR SDRAM UDIMM MT16VDDT6464A 512MB MT16VDDT12864A 1GB MT16VDDT25664A 2GB

DDR SDRAM UDIMM MT16VDDT6464A 512MB MT16VDDT12864A 1GB MT16VDDT25664A 2GB DDR SDRAM UDIMM MT16VDDT6464A 512MB MT16VDDT12864A 1GB MT16VDDT25664A 2GB For component data sheets, refer to Micron s Web site: www.micron.com 512MB, 1GB, 2GB (x64, DR) 184-Pin DDR SDRAM UDIMM Features

More information

Comparative Study of blocking mechanisms for Packet Switched Omega Networks

Comparative Study of blocking mechanisms for Packet Switched Omega Networks Proceedings of the 6th WSEAS Int. Conf. on Electronics, Hardware, Wireless and Optical Communications, Corfu Island, Greece, February 16-19, 2007 18 Comparative Study of blocking mechanisms for Packet

More information

1. The values of t RCD and t RP for -335 modules show 18ns to align with industry specifications; actual DDR SDRAM device specifications are 15ns.

1. The values of t RCD and t RP for -335 modules show 18ns to align with industry specifications; actual DDR SDRAM device specifications are 15ns. UDIMM MT4VDDT1664A 128MB MT4VDDT3264A 256MB For component data sheets, refer to Micron s Web site: www.micron.com 128MB, 256MB (x64, SR) 184-Pin UDIMM Features Features 184-pin, unbuffered dual in-line

More information

Quality-of-Service for a High-Radix Switch

Quality-of-Service for a High-Radix Switch Quality-of-Service for a High-Radix Switch Nilmini Abeyratne, Supreet Jeloka, Yiping Kang, David Blaauw, Ronald G. Dreslinski, Reetuparna Das, and Trevor Mudge University of Michigan 51 st DAC 06/05/2014

More information

A Key Theme of CIS 371: Parallelism. CIS 371 Computer Organization and Design. Readings. This Unit: (In-Order) Superscalar Pipelines

A Key Theme of CIS 371: Parallelism. CIS 371 Computer Organization and Design. Readings. This Unit: (In-Order) Superscalar Pipelines A Key Theme of CIS 371: arallelism CIS 371 Computer Organization and Design Unit 10: Superscalar ipelines reviously: pipeline-level parallelism Work on execute of one instruction in parallel with decode

More information

I/O, today, is Remote (block) Load/Store, and must not be slower than Compute, any more

I/O, today, is Remote (block) Load/Store, and must not be slower than Compute, any more I/O, today, is Remote (block) Load/Store, and must not be slower than Compute, any more Manolis Katevenis FORTH, Heraklion, Crete, Greece (in collab. with Univ. of Crete) http://www.ics.forth.gr/carv/

More information

Introduction to Communications Part One: Physical Layer Switching

Introduction to Communications Part One: Physical Layer Switching Introduction to Communications Part One: Physical Layer Switching Kuang Chiu Huang TCM NCKU Spring/2008 Goals of This Lecture Through the lecture and in-class discussion, students are enabled to compare

More information

H-MMAC: A Hybrid Multi-channel MAC Protocol for Wireless Ad hoc Networks

H-MMAC: A Hybrid Multi-channel MAC Protocol for Wireless Ad hoc Networks H-: A Hybrid Multi-channel MAC Protocol for Wireless Ad hoc Networks Duc Ngoc Minh Dang Department of Computer Engineering Kyung Hee University, Korea Email: dnmduc@khu.ac.kr Choong Seon Hong Department

More information

The Impact of the DOCSIS 1.1/2.0 MAC Protocol on TCP

The Impact of the DOCSIS 1.1/2.0 MAC Protocol on TCP The Impact of the DOCSIS 11/20 MAC Protocol on TCP Jim Martin Department of Computer Science Clemson University Clemson, SC 29634-0974 jimmartin@csclemsonedu Abstract-- The number of broadband cable access

More information

A Software LDPC Decoder Implemented on a Many-Core Array of Programmable Processors

A Software LDPC Decoder Implemented on a Many-Core Array of Programmable Processors A Software LDPC Decoder Implemented on a Many-Core Array of Programmable Processors Brent Bohnenstiehl and Bevan Baas Department of Electrical and Computer Engineering University of California, Davis {bvbohnen,

More information

IBM PowerPRS TM. A Scalable Switch Fabric to Multi- Terabit: Architecture and Challenges. July, Speaker: François Le Maut. IBM Microelectronics

IBM PowerPRS TM. A Scalable Switch Fabric to Multi- Terabit: Architecture and Challenges. July, Speaker: François Le Maut. IBM Microelectronics IBM PowerPRS TM A Scalable Switch Fabric to Multi- Terabit: Architecture and Challenges July, 2002 Speaker: François Le Maut IBM Microelectronics Outline Introduction General Architecture Flow Control

More information

A Deficit Round Robin with Fragmentation Scheduler for Mobile WiMAX

A Deficit Round Robin with Fragmentation Scheduler for Mobile WiMAX A Deficit Round Robin with Fragmentation Scheduler for Mobile WiMAX Chakchai So-In, Raj Jain and Abdel-Karim Al Tammi Washington University in Saint Louis Saint Louis, MO 63130 jain@cse.wustl.edu Presentation

More information

Top-Down Network Design

Top-Down Network Design Top-Down Network Design Chapter Two Analyzing Technical Goals and Tradeoffs Original slides by Cisco Press & Priscilla Oppenheimer Scalability Availability Performance Accuracy Security Manageability Usability

More information

CS 123: Lecture 12, LANs, and Ethernet. George Varghese. October 24, 2006

CS 123: Lecture 12, LANs, and Ethernet. George Varghese. October 24, 2006 CS 123: Lecture 12, LANs, and Ethernet George Varghese October 24, 2006 Selective Reject Modulus failure Example w = 2, Max = 3 0 0 1 3 0 A(1) A(2) 1 0 retransmit A(1) A(2) buffer Case 1 Case 2 reject

More information

Enhanced Forward Explicit Congestion Notification (E-FECN) Scheme for Datacenter Ethernet Networks

Enhanced Forward Explicit Congestion Notification (E-FECN) Scheme for Datacenter Ethernet Networks Enhanced Forward Explicit Congestion Notification (E-FECN) Scheme for Datacenter Ethernet Networks Chakchai So-In, Raj Jain, and Jinjing Jiang Department of Computer Science and Engineering Washington

More information

BROADBAND AND HIGH SPEED NETWORKS

BROADBAND AND HIGH SPEED NETWORKS BROADBAND AND HIGH SPEED NETWORKS ATM SWITCHING ATM is a connection-oriented transport concept An end-to-end connection (virtual channel) established prior to transfer of cells Signaling used for connection

More information

Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies. Admin

Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies. Admin Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies Alvin R. Lebeck CPS 220 Admin Homework #5 Due Dec 3 Projects Final (yes it will be cumulative) CPS 220 2 1 Review: Terms Network characterized

More information

Performance Evaluation of WiFiRe using OPNET

Performance Evaluation of WiFiRe using OPNET Performance Evaluation of WiFiRe using OPNET Under the guidance of: Prof. Sridhar Iyer and Prof. Varsha Apte Dept. of CSE (KReSIT) July 16, 2007 Goal Goal Building. Finding minimum slot length to support

More information

Network-on-chip (NOC) Topologies

Network-on-chip (NOC) Topologies Network-on-chip (NOC) Topologies 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and performance

More information

Hardware Assisted Recursive Packet Classification Module for IPv6 etworks ABSTRACT

Hardware Assisted Recursive Packet Classification Module for IPv6 etworks ABSTRACT Hardware Assisted Recursive Packet Classification Module for IPv6 etworks Shivvasangari Subramani [shivva1@umbc.edu] Department of Computer Science and Electrical Engineering University of Maryland Baltimore

More information

ECEN 449 Microprocessor System Design. Memories. Texas A&M University

ECEN 449 Microprocessor System Design. Memories. Texas A&M University ECEN 449 Microprocessor System Design Memories 1 Objectives of this Lecture Unit Learn about different types of memories SRAM/DRAM/CAM Flash 2 SRAM Static Random Access Memory 3 SRAM Static Random Access

More information

2. SDRAM Controller Core

2. SDRAM Controller Core 2. SDRAM Controller Core Core Overview The SDRAM controller core with Avalon interface provides an Avalon Memory-Mapped (Avalon-MM) interface to off-chip SDRAM. The SDRAM controller allows designers to

More information

DDR SDRAM SODIMM MT16VDDF6464H 512MB MT16VDDF12864H 1GB

DDR SDRAM SODIMM MT16VDDF6464H 512MB MT16VDDF12864H 1GB SODIMM MT16VDDF6464H 512MB MT16VDDF12864H 1GB 512MB, 1GB (x64, DR) 200-Pin DDR SODIMM Features For component data sheets, refer to Micron s Web site: www.micron.com Features 200-pin, small-outline dual

More information