A General Purpose Queue Architecture for an ATM Switch


Mitsubishi Electric Research Laboratories
Cambridge Research Center
Technical Report 94-7
September 3, 1994

A General Purpose Queue Architecture for an ATM Switch

Hugh C. Lauer, Abhijit Ghosh, Chia Shen

Presented at the Massachusetts Telecommunications Conference, University of Massachusetts at Lowell, Lowell, MA, October 25, 1994.

This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research Laboratories of Cambridge, Massachusetts; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Research Laboratories. All rights reserved.

Copyright Mitsubishi Electric Research Laboratories, Inc., 1994
201 Broadway, Cambridge, Massachusetts 02139

Revision History:
1. First version: July 28, 1994
2. Revised September 3, 1994

A General Purpose Queue Architecture for an ATM Switch

Hugh C. Lauer, Abhijit Ghosh, Chia Shen
Mitsubishi Electric Research Laboratories
201 Broadway, Cambridge, MA 02139

Abstract

This paper describes a general purpose queue architecture for an ATM switch capable of supporting both real-time and non-real-time communication. The central part of the architecture is a kind of searchable, self-timed FIFO circuit into which the queues of all virtual channels in the switch are merged. Arriving cells are tagged with numerical values indicating the priorities, deadlines, or other characterizations of the order of transmission, and are then inserted into the queue. Entries are selected from the queue both by destination and by tag, with the earliest entry being selected from among a set of equals. By this means, the switch can schedule virtual channels independently, but without maintaining separate queues for each one. This architecture supports a broad class of scheduling algorithms at ATM speeds, so that guaranteed qualities of service can be provided to real-time applications. It is programmable because the tag calculations are done in microprocessors at the interface to each physical link connected to the switch.

Problem Statement

A modern high speed network needs to support hard real-time applications such as plant control, continuous media such as audio and video, as well as ordinary data communications. In order to do this, it is necessary to be able to schedule the real-time communications to meet quality of service goals and to protect them from the unpredictable demands of the non-real-time communications, all the while providing the remaining bandwidth to the non-real-time communications. There are two parts to the real-time scheduling problem: an off-line part dealing with the setting up of connections (admission control), and an on-line part involving the actual scheduling of individual messages, packets, or cells as they arrive.
A large body of theory about real-time scheduling has yielded many results predicting that certain combinations of off-line and on-line scheduling algorithms can guarantee certain qualities of service if certain a priori conditions are satisfied. In real-time ATM networks, the challenge is the on-line part, i.e., cell-by-cell scheduling. For a network operating at 622 megabits per second, an ATM switch must make a scheduling decision for each physical link within each cell cycle of 680 nanoseconds. In the general case, there will be n virtual channels with cells awaiting transmission, each of which has its own scheduling requirements. Therefore, each decision involves selecting the first cell from the virtual channel with the highest priority, earliest deadline, most frequent rate, or other parameter.

Address: Mitsubishi Electric Research Laboratories, 1050 E. Arques Avenue, Sunnyvale, CA 94086
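These figures can be checked with a little arithmetic (a sketch; the cell-size and line-rate constants are the standard ATM and OC-12 values, not taken from this paper):

```python
# Rough sanity check on the timing figures, assuming standard
# 53-byte ATM cells and the OC-12 line rate of 622.08 Mb/s.
CELL_BITS = 53 * 8        # 424 bits per ATM cell
LINK_RATE = 622.08e6      # bits per second
K_LINKS = 16              # physical links on the switch

# One cell cycle, i.e., one scheduling interval per link (~680 ns).
cell_cycle_ns = CELL_BITS / LINK_RATE * 1e9

# Budget for each of the k searches that must fit in one cycle.
search_budget_ns = cell_cycle_ns / K_LINKS
```

The cycle works out to roughly 682 ns, which the paper rounds to 680 ns, and the per-link search budget to roughly 43 ns, matching the "about 40 nanoseconds" quoted later.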

This scheduling decision can be reduced to a comparison to select the minimum or maximum from among a set of n numerical values. There is no way around this n-way comparison. It is the fundamental complexity of real-time scheduling, and it must be done once during every scheduling interval. Obviously, if n is very small, it is not difficult. But if n is large, as is likely to be the case in a network with many real-time channels with widely varying characteristics, the comparison becomes a significant part of the design of the switch.

Shared-buffer Switch Architecture

At Mitsubishi Electric, ATM switches use a shared-buffer architecture, an unfolded view of which is shown in Figure 1. The central fabric of the switch is a 2k-port memory, where k is the number of physical links connected to the switch. During each cell cycle, k cells arrive at the k input sides of the physical links, shown on the left, and are concurrently written into the shared cell buffer memory. At the same time, k cells are read from the same memory for transmission to the k output sides of the links. Scheduling is merely a matter of selecting which cells to transmit at which time. The advantage of this architecture over the more common input- or output-buffered switches is that less total memory is required, because it can be statistically multiplexed among the physical links. The total bandwidth of the memory must be 2 * k * B cells per second, where B is the bandwidth of the individual links. This is only slightly more stringent than the buffers of input- and output-buffered switches, which require k separate buffers with bandwidth (k + 1) * B each in order to be non-blocking. Therefore, the shared-buffer architecture makes better use of the physical resources of the switch [Oshima92].

[Figure 1: Shared-buffer ATM Switch Architecture]

Selectable Queue Circuit

One way to do cell-by-cell scheduling is to use a self-timed FIFO with a search circuit such as that described in [Kondoh94] and illustrated in Figure 2. This circuit contains a linear array of registers, as many as there are cell buffers in the memory. Each register contains a cell address, a priority value, and a k-bit destination vector in which one bit represents each physical link. A value of one in a position of the destination vector means that the cell addressed by the entry is waiting to be transmitted to that link. During each cell cycle, the circuit is searched k times, once for each link. Each search finds the first entry containing a one bit in the corresponding position of the destination vector, i.e., the entry nearest the head of the queue (at the bottom of Figure 2). The cell address is retrieved, the cell at that buffer address is transmitted, and the corresponding bit of the destination vector is reset. When all of the destination vector bits of an entry are zero, the entry is deemed to be vacant. The internal, self-timed circuitry then propagates all subsequent entries forward to fill up the vacant spaces. In this sense, it acts like a traditional FIFO circuit. New entries are always added to the tail of the queue (shown at the top of the figure).

[Figure 2: Self-timed FIFO with search circuit]

To support real-time communication, we use a modification of this circuit in which selection takes account of the priority information encoded in numerical tags. Instead of merely selecting the first entry with a particular destination bit set, this circuit selects the entry with both the smallest tag value and a destination bit of one.
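The behavior of the basic destination-bit search can be sketched as a toy software model (the class, method names, and list representation are ours, not taken from [Kondoh94]):

```python
class SelectableQueue:
    """Toy model of the searchable self-timed FIFO of [Kondoh94].

    Entries are kept head-first in a Python list (index 0 is the head
    of the queue), so list deletion stands in for the self-timed
    propagation circuitry that fills vacant spaces.
    """

    def __init__(self):
        self.entries = []  # each entry: [cell_address, dest_vector]

    def insert(self, cell_address, dest_vector):
        # New entries always join the tail of the queue.
        self.entries.append([cell_address, dest_vector])

    def select(self, link):
        # Find the entry nearest the head whose destination bit for
        # `link` is one, reset that bit, and vacate the entry once
        # all of its destination bits are zero.
        mask = 1 << link
        for i, entry in enumerate(self.entries):
            if entry[1] & mask:
                entry[1] &= ~mask
                addr = entry[0]
                if entry[1] == 0:
                    del self.entries[i]  # compaction fills the vacancy
                return addr
        return None  # no cell is waiting for this link
```

For example, a cell destined for two links stays queued until both of its destination bits have been consumed by separate searches.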
A block diagram of the circuit is shown in Figure 3.

This figure shows only one column of destination bits (corresponding to one physical link), a tag register for each entry, comparators between adjacent entries, and a simplified selection circuit. Not shown are the cell addresses and the self-timed circuitry which causes entries to propagate from the tail to the head of the queue. The comparators all operate in parallel. Each compares the value of its tag register with the output of the previous comparator (toward the tail of the queue). If the destination bit is one, it returns the smaller of these two values for use by the comparator of the next entry (toward the head of the queue); otherwise, it simply propagates the value from the previous entry to the next entry. Each comparator also outputs a single bit, which has a value of one if and only if (a) the destination bit of that entry is one, and (b) the tag of that entry is not greater than any tag propagated from entries toward the tail of the queue. The selection circuitry on the right is identical to that of Figure 2 and finds the first such entry.

[Figure 3: Tag-based Selectable Queue Circuit]

The queue circuit of Figure 3 is an extremely simplified one. For a practical circuit in a k-by-k shared-buffer switch operating at 622 megabits per second, the circuit must search the queue k times in each 680-nanosecond cell cycle. If k = 16, for example, each search must be completed in about 40 nanoseconds. Even with lookahead, the ripple effect of the comparators takes too long. Therefore, we actually partition the search into sections, each producing a minimum tag value, and then compare the results of the sections hierarchically to select the overall minimum tag value.
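The sectioned, tag-based search can be modeled in software as follows (a sketch; the function name and the default section size are our assumptions, not values from the paper):

```python
def select_min_tag(entries, link, section_size=4):
    """Return the index of the entry with the smallest tag whose
    destination bit for `link` is one, with ties going to the entry
    nearest the head of the queue, as in Figure 3.

    `entries` is head-first: a list of (tag, dest_vector) pairs.
    The search runs per section and then compares section winners,
    mimicking the hierarchical partitioning used to meet the timing
    budget. Returns None if no entry is destined for `link`.
    """
    mask = 1 << link
    winners = []  # (tag, index) winner of each section
    for start in range(0, len(entries), section_size):
        best = None
        for i in range(start, min(start + section_size, len(entries))):
            tag, dest = entries[i]
            # Strict < keeps the earliest entry among equal tags.
            if dest & mask and (best is None or tag < best[0]):
                best = (tag, i)
        if best is not None:
            winners.append(best)
    if not winners:
        return None
    # Compare the section results; the index in the key breaks any
    # remaining ties in favor of the head-most entry.
    return min(winners, key=lambda w: (w[0], w[1]))[1]
```

In the hardware, the sections and the final comparison all operate concurrently; this sequential model only captures the selection semantics, not the timing.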
Experience with an evaluation circuit reported by Kondoh, plus estimates based on 0.8µ CMOS design rules, indicates that a search can be completed in less than 30 nanoseconds for 512 queue entries with 16-bit tags.

Implementation of Real-time Scheduling Algorithms

This scheduling mechanism is extremely flexible and general, and can support a broad class of real-time and non-real-time scheduling algorithms. The key feature is the ability to compute tags for individual cells when they arrive at the switch. For example:

- In rate monotonic systems, tags represent intervals between periodic messages. Tasks requiring messages with higher frequencies are assigned smaller tags. Tags can be pre-computed when virtual channels are set up, then assigned by table look-up when cells arrive.
- In the Virtual Clock algorithm [Zhang91], tags represent the logical arrival times of cells. These are computed by taking the actual arrival time of each cell, then adjusting it as necessary to reflect early arrivals, etc.
- In the Earliest Deadline First algorithm [Zheng94], tags represent deadlines by which cells must be transmitted. They are calculated by adding pre-computed delay values to logical arrival times.

In addition, round robin, weighted fair queuing, and other algorithms can be supported by defining appropriate tag-calculation functions to be applied to cells arriving at the switch. Moreover, multiple classes of traffic can be supported simultaneously by partitioning the space of tag values and assigning tags in the different subspaces by different algorithms.

In some of the algorithms, such as Virtual Clock and Earliest Deadline First, tag values increase monotonically for each virtual channel. Eventually, they will overflow the tag field of the registers in the queue. To cope with this, a control signal is provided which, when raised, causes each tag to be treated as having one additional high-order bit. The value of this bit is the opposite of the actual high-order bit of the tag. Initially, all tags are zero. After a while, tags of newer entries in the queue have high-order bits of one.
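The effect of this control signal on comparisons can be modeled in software as follows (a sketch, assuming 16-bit tags as in the evaluation circuit; the function name and framing are ours):

```python
TAG_BITS = 16
MSB = 1 << (TAG_BITS - 1)

def effective_tag(tag, control):
    """Value the queue hardware effectively compares.

    When `control` is raised, each tag is treated as having one extra
    high-order bit whose value is the complement of the tag's own
    most significant bit. Older entries (MSB = 1) then compare
    smaller than newer, wrapped-around entries (MSB = 0), preserving
    the intended service order across tag overflow.
    """
    if not control:
        return tag
    extra = 0 if (tag & MSB) else 1
    return (extra << TAG_BITS) | tag
```

With the signal raised, a wrapped-around tag such as 0x0005 compares larger than a pre-wrap tag such as 0x8000; with it lowered, ordinary comparison applies.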
Assuming that there are enough bits in the tag fields and that all entries are chosen within a predictable period of time, eventually all tags in the queue will have high-order bits of one; i.e., the earlier entries will have been selected and transmitted, and the tag values of new entries will not yet have wrapped around back to zero. At this point, the control signal is raised. When the tag values eventually do wrap around back to zero, they will be treated as being larger than those of the earlier queue entries with ones in the high-order bits of their tags. Later still, all entries with ones in the high-order bits of their tags will have been removed from the queue, at which time the control signal can be lowered again. Note that this depends upon the tag values having enough bits that the interval between wrap-arounds is greater than the lifetime of any entry in the queue.

Summary

A queue architecture for a shared-buffer ATM switch has been described. This architecture supports real-time communication traffic by allowing cell-by-cell scheduling according to a wide variety of scheduling algorithms. The central feature of the architecture is a tag-based FIFO queue circuit containing entries for all cells to all destinations. It can be searched for the cell with the smallest tag for a particular destination. In effect, it provides the necessary queuing without requiring separate queues for individual virtual channels. The queue circuit can be implemented in VLSI and can control a 16-by-16 switch at 622 megabits per second.

References

[Kondoh94] H. Kondoh, H. Yamanaka, M. Ishiwaki, Y. Matsuda, and M. Nakaya, "An Efficient, Self-Timed Queue Architecture for ATM Switch LSIs," Proceedings of the Custom Integrated Circuits Conference, San Diego, May 1994.

[Oshima92] K. Oshima, H. Yamanaka, H. Saito, H. Yamada, S. Kohama, H. Kondoh, Y. Matsuda, "A New ATM Switch Architecture based on STS-type Shared Buffering and its LSI Implementation," Proceedings of the International Switching Symposium '92, Yokohama, Japan, October 1992, pp. 359-363.

[Zhang91] L. Zhang, "Virtual Clock: A new traffic control algorithm for packet-switched networks," ACM Transactions on Computer Systems, vol. 9, no. 2, May 1991, pp. 101-124.

[Zheng94] Q. Zheng, K. Shin, and C. Shen, "Real-time Communication in ATM Networks," to appear in the 19th Annual Local Computer Networks Conference, Minneapolis, Oct. 2-5, 1994.