Advanced Core Router Architectures A Hardware Perspective


Muzammil Iqbal
Department of Electrical and Computer Engineering, University of Delaware, DE 976.

Abstract

The environment of high-performance IP networks and growing consumer demand for Internet bandwidth have placed an unprecedented load on today's Internet Service Providers (ISPs). Gateways, or routers as they have come to be known, play an essential role in modern Internet infrastructure. Their design has seen tremendous advancement during the last decades in order to sustain the high bandwidth requirements at the core. In this paper we identify important trends in design and some design issues facing the fourth generation of IP routers. This area encompasses a vast range of topics and it is not possible to cover them all; we therefore restrict our survey to the architectures of high-end backbone routers from a hardware viewpoint.

1 Introduction

The popularity of the Internet has caused Internet traffic to grow drastically every year. The predicted growth is that Internet traffic triples every three to four months, and a daily volume of thousands of terabits is expected by 2003, almost ten times the traffic volume traversing the Internet today [Pluris 7]. This will incapacitate the presently deployed core, whose throughput is in the gigabit regime. Another factor is the advancement of optical fiber communication technology, which, with the advent of DWDM, has made it possible to carry several channels over a single fiber, each with a capacity of several gigabits. 16-channel OC-192 (10 Gbps per channel) systems are not uncommon these days, and OC-768 speeds will be available soon. In such a scenario, an obvious bottleneck is the routing/switching device, which is not compatible with the available ultra-fast communication technology. The reason is inherent in the functionality of these devices, which causes them to run at a rate slower than the line rate.
Also, with the advent of QoS applications running complex algorithms, switching speeds have been adversely affected. A great deal of research interest therefore lies in building high-capacity, high-speed routers that can scale to the predicted growth trends as well as provide advanced QoS services such as the Integrated Services or Differentiated Services models. Our paper focuses on design trends for backbone routers; scheduling and route-lookup algorithms, although alluded to, are not discussed in detail. The rest of the paper is organized as follows. In Section 2, we briefly discuss initial design trends that resulted in several generations of router architectures. In Sections 3 and 4, the single-bus shared-memory architecture and the crossbar switch architecture are discussed in greater detail. Section 5 describes various techniques employed for switch arbitration and control, and discusses the hardware implementation of a fast crossbar scheduler. Section 6 deals with endeavors to build VLSI-based Internet routers with throughput in the terabit regime. In Section 7, we discuss the emerging idea of using terabit-capacity optical backplanes to switch data at very high speed. Section 8 presents the conclusion.

2 Essential Elements of a Routing System

A router is a layer 3 device in the OSI layer model. It has two fundamental functions: 1) routing and 2) forwarding of IP packets. The routing process gathers information about the network topology and creates a routing table. The packet forwarding process copies a packet from an input interface to the proper output interface based on the information contained in the forwarding table. Any routing system requires four essential elements to implement the routing and forwarding process: routing software, packet processing, a switch fabric and line cards (Fig 2.1). For any system designed to operate at the core of the Internet, all four elements must be equally efficient.
Fig 2.1: Block diagram of a generic router (routing engine: main processor, routing table, routing software; forwarding engine: packet processing, forwarding table, switch fabric)

The main processor runs the routing software, which performs the routing functions and maintains information about the Internet topology through the routing protocol. The switch fabric performs the forwarding operation using the forwarding processor, which is usually an ASIC optimized to perform a specific task at high speed. At this point it is appropriate to mention that in today's backbone routers, the usual practice is to maintain another table, called the forwarding table, at the forwarding engine (FE). The FE reads the destination address from the packet header, performs route lookup, finds the best match and forwards the packet to the determined output interface. The forwarding table is essentially derived from the routing table but is updated less often. The routing table is maintained by the routing software, which performs relatively slow processing while updating the table; once the table has been updated, its image is transferred to the forwarding engine. This image can be a subset of the complete routing table or may be modified to fit into a smaller version. The switch fabric is essentially the core of the forwarding engine. In essence, the router comprises the routing engine, implemented in software, which uses the main CPU of the router to carry out the more complex operations such as running the routing protocol, traffic engineering features, QoS guarantees, modifying the packet header before departure and other software-based features; and the forwarding engine, which carries out simpler tasks at a very high rate.

2.1 Legacy Architectures

Fig 2.2: First-generation architecture (line cards with MAC and DMA attached to a shared bus; central route processor, CPU and memory)

Over the years, many different architectures have been used for routers. Particular architectures have been selected based on a number of factors, including cost, number of ports, required performance and currently available technology.
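Across all of these generations, the core job of the forwarding engine is the longest-prefix-match lookup against the forwarding table. It can be sketched as follows; this is an illustrative sketch only — the prefixes, interface names and the linear scan are our assumptions, and real forwarding engines use specialized structures (tries, CAMs) in hardware:

```python
import ipaddress

# Hypothetical forwarding table (FIB): (prefix, next-hop interface) entries,
# derived offline from the routing table and pushed to the forwarding engine.
FIB = [
    (ipaddress.ip_network("0.0.0.0/0"), "if0"),    # default route
    (ipaddress.ip_network("10.0.0.0/8"), "if1"),
    (ipaddress.ip_network("10.1.0.0/16"), "if2"),
]

def lookup(dst: str) -> str:
    """Longest-prefix match: among all matching prefixes, the longest wins."""
    addr = ipaddress.ip_address(dst)
    matches = [(net.prefixlen, hop) for net, hop in FIB if addr in net]
    return max(matches)[1]   # most-specific route wins
```

A destination such as 10.1.2.3 matches all three entries here, but the /16 is the most specific, so its interface is returned.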
The detailed implementation of individual commercial routers has generally remained proprietary, but in broad terms all routers have evolved in similar ways and have followed a common trend of development. The first trend has been the implementation of more and more data-path functions in hardware. In recent years, improvements in the integration of CMOS technology have made it possible to implement a larger number of functions in ASIC components, especially those that were traditionally implemented in software.

Fig 2.3: Second-generation architecture (line cards with DMA, route-cache memory and MAC; central route processor, CPU and memory; cache-update and packet paths over the shared bus)

The first generation of routers was simpler from an architectural viewpoint in that it used a centralized processor, a centralized buffer and a centralized bus connected to the line cards. Incoming packets had to traverse the same bus in order to be scheduled on an output interface. The interface cards were dumb I/O devices with no packet-processing capabilities. This design clearly shows its deficiencies in that the bus can only be used by one line card at a time. Moreover, a packet has to traverse the bus twice after arrival on an ingress port. First, it is written into the memory while the route lookup and scheduling decision are taken by the route processor; second, once such a decision has been reached, it is de-queued from the memory and traverses the bus again to reach the appropriate output interface(s). This is illustrated in Fig 2.2. Another deficiency was that a general-purpose CPU was the workhorse of the device. All functions pertaining to the routing and forwarding processes had to be performed by the same processor, levying a tremendous load and contributing to the bottleneck of the system.
The second-generation design, shown in Fig 2.3, added special-purpose ASIC processors and some memory to the interface cards, which can perform the packet-header lookup to retrieve destination information and also buffer the packet until the bus is available for it to traverse. The satellite processors in the line cards each kept only a modest cache of recently used routes, which enables a line card to perform route lookup itself and to depend on the central processor only for bus arbitration. This cache is periodically updated; if a route is not found in the route cache, the main processor looks it up. This technique relieves the processing load on the central CPU, but bus arbitration is still a bottleneck. The second-generation architectures were short-lived because of their inability to support the higher throughput

requirements at the core. The major deficiencies pointed out by the newly emerging crossbar-architecture regime were the following. First, congestion: the bandwidth is shared among all the ports, leading to contention and additional forwarding delays. Under severe congestion, when the arrival rate of packets exceeded the capacity of the bus, buffers would overflow and data would be lost. Second, high-speed shared buses are difficult to design: the electrical loading caused by multiple ports on a shared bus, the number of connectors that a signal encounters, and reflections from the ends of unterminated lines all limit the transfer capacity of a bus. Another bus-based multiple-processor router architecture, described in [Asthana 8], uses multiple forwarding engines in parallel to achieve high packet-processing rates, as shown in Fig 2.4. As a packet is received, the IP header is stripped by the control circuitry, augmented with an identifying tag, and sent to a forwarding engine for validation and routing. While the forwarding engine is performing the routing function, the remainder of the packet is deposited in an input buffer in parallel. The forwarding engine determines which outgoing link the packet should be transmitted on, and sends updated header fields to the appropriate destination interface module along with the tag information. The packet is then moved from the buffer in the source interface module to a buffer in the destination interface module and eventually transmitted on an outgoing link.

Fig 2.4: Bus-based architecture with multiple parallel forwarding engines (forwarding engines and network interfaces attached to separate control, forwarding and data buses, with a resource-control unit)

3 The Shared Memory Architecture

The design described above forms a premise for the shared-memory architecture, which employs it for building higher-end backbone routers.
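The tag-and-dispatch scheme described above can be sketched in a few lines. Everything here is a simplified stand-in — the round-robin dispatch, the routing rule and all names are our illustrative assumptions, not the actual design from [Asthana 8]:

```python
from itertools import cycle

# Toy model: headers go to one of several forwarding engines while the
# payload is parked in a buffer under the same tag until the routing
# decision comes back.
FE_POOL = cycle([0, 1, 2])   # three forwarding engines, picked round-robin
payload_buffer = {}          # tag -> payload parked at the input module

def route_header(header):
    # Stand-in for the forwarding engine's real lookup logic.
    return header["dst"] % 4          # pretend there are 4 outgoing links

def receive(tag, header, payload):
    payload_buffer[tag] = payload     # payload deposited in the input buffer
    engine = next(FE_POOL)            # header handed to a forwarding engine
    out_link = route_header(header)   # engine picks the outgoing link
    return engine, out_link

def transmit(tag, out_link):
    # Move the parked payload toward the chosen output interface module.
    return out_link, payload_buffer.pop(tag)
```

The key property the sketch illustrates is that header processing and payload buffering proceed in parallel, joined only by the tag.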
It belongs to a contemporary regime that competes with the crossbar switch-fabric regime, and both have high-end devices in the market targeting core applications. This design belongs to the third generation of router architectures. It employs an ultra-high-speed, high-capacity bus with a logically centralized memory that is physically distributed over multiple forwarding engines working in parallel. In this section we discuss the architecture of a shared-memory-based router from Juniper Networks. Invariably, all higher-end routing systems of today use specialized hardware, which tends to become more and more complex as technology advances. Router vendors use these special-purpose ASICs for varied functions such as packet processing, route lookup, packet classification, and maintaining per-flow state for QoS and traffic engineering. The complexity of such hardware grows exponentially as functionality increases. The processing time for basic forwarding operations is the benchmark for overall device efficiency. Currently the CMOS process has reached 0.18 microns, and even 0.13 microns for state-of-the-art microprocessors and RISC processors. With this level of gate density it is now possible to integrate several million transistors on a single ASIC. These high-density chips lend themselves well to more and more complex network operations, unprecedented so far. The point of this discussion is to emphasize that the performance of routing devices has seen tremendous enhancement owing to the rapid growth of VLSI technology. We have basically two applications for large-scale electronic integration. The first is in the switch fabric, where the connection-switching latency, also referred to as the cell time, is an important factor in determining the overall data throughput of the device.
The second, as said above, is the specialized ASICs, which have evolved into massively sophisticated devices; they offer very high-speed functionality for operations thus far implemented in software in the main routing engine. An appropriate reference would be QoS and traffic-engineering functions, whose scope of implementation was limited by their complexity and slow execution in software. The Juniper M-series routers have been offered with 40 to 160 Gbps backplanes. Such high throughput requires high-speed transceiver circuitry to retrieve the data from the line card, very efficient buffering, fast route lookup and an ultra-high-speed centralized bus to switch the packet from one interface to another. There are also more subtle issues involved in a shared-memory system, such as memory management and control signaling, whose efficient implementation and coordination can form the premise for a high-speed device. The design can be generalized as an emulation of an output-queued, fixed-length cell switch. As we discuss in later sections as well, it is becoming a trend in the gigabit and terabit regimes to adopt fixed-length cell switching due to the ease of implementing the communication channels. This idea has been adopted from ATM switches and utilized in a packet-switching device instead of a circuit-switched one. The centralized memory architecture has the following functions:

1) All input packets are fragmented into fixed-length 64-byte cells, and a new header (notification) containing control information is generated. This is performed at the ingress port by an ASIC.

2) The cells are forwarded to a buffer-management ASIC for storage in the memory.

Fig 3.1: Packet forwarding engine of the M-series Juniper router (input and output interfaces, I/O manager, distributed buffer manager, Internet Processor with route lookup, shared memory, 100 Mbps PCI bus)

3) The notification information for each packet is forwarded to the Internet Processor ASIC, which performs the route lookup and takes the scheduling decision according to the state of the centralized bus.

4) The Internet Processor ASIC also performs bus arbitration and schedules the packet on an output interface. Once this decision is taken, the relevant buffer-management ASIC is notified so that the cells can be de-queued from the memory.

5) However, instead of the packet itself, the output notification generated by the Internet Processor ASIC is queued at the output port. When it reaches the head of line in the queue, the cells pertaining to that notification are de-queued from the distributed memory locations and assembled at the output port before departure.

It should be emphasized that internal control-communication overhead is heavily reduced by exchanging notifications, which amount to 53 bytes of data. The actual packet resides in the memory until its output notification has reached the head of line at the output port. Fig 3.1 shows the logical process of packet forwarding. The actual hardware implementation is more distributed, as shown in Fig 3.2. The packet-forwarding engine constitutes the main switching fabric of the device. The shared memory is distributed over several Flexible PIC Concentrators, but logically it acts as a single entity. The architecture is designed in a hierarchical manner.
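The five steps above can be sketched as a toy model. The cell size matches the text, but the addressing scheme, queue layout and field names are our own illustrative assumptions, not Juniper's implementation:

```python
CELL = 64                 # fixed cell size from step 1
shared_memory = {}        # cell address -> 64-byte cell (the shared pool)
output_queues = {0: []}   # per-port FIFO of notifications, not packets

def enqueue(packet, out_port, base_addr):
    # Steps 1-3: segment into cells, store them, queue a small notification.
    addrs = []
    for off in range(0, len(packet), CELL):
        cell = packet[off:off + CELL].ljust(CELL, b"\0")   # pad the last cell
        shared_memory[base_addr + len(addrs)] = cell
        addrs.append(base_addr + len(addrs))
    output_queues[out_port].append({"addrs": addrs, "length": len(packet)})

def dequeue(out_port):
    # Steps 4-5: when a notification reaches the head of line, fetch the
    # cells from the shared pool and reassemble the original packet.
    note = output_queues[out_port].pop(0)
    cells = b"".join(shared_memory.pop(a) for a in note["addrs"])
    return cells[:note["length"]]
```

Note that only the small notification record sits in the output queue; the cells stay in the shared pool until departure, which is the point of the design.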
The line cards, or Physical Interface Cards (PICs) as Juniper calls them, terminate in Flexible PIC Concentrators (FPCs). Each FPC includes a 128-MB DRAM memory bank, which is part of a large pool constituted by all eight FPCs. The I/O manager responsible for memory access and packet fragmentation is also housed on the FPC. The FPCs are connected to a pool of packet-forwarding modules containing the distributed buffer-manager ASIC and the Internet Processor ASIC. Their functionality is described in greater detail below.

Flexible PIC Concentrators (FPCs)

Four Physical Interface Cards (PICs): These are the line cards. They receive and transmit data packets from the network and perform framing, speed signaling and encapsulation.

Two Packet Director ASICs: Distribute incoming packets from the PICs among the I/O managers. This allows for uniform distribution of the traffic load among several devices to achieve some level of parallelism.

Four I/O Manager ASICs: Parse layer 2 and 3 data and perform encapsulation and segmentation of data packets into 64-byte cells. They read and write data to memory according to the addresses assigned by the buffer manager. The I/O manager is also responsible for packet reassembly at the time of departure. It queues the packet notifications generated by the buffer manager and waits until a particular notification has reached the head of line of its queue; at that time the cells are de-queued from the memory and reassembled for departure.

Shared Memory: This comprises eight identical 128-MB SDRAMs. It is essentially a two-way, highly non-blocking memory that offers single-stage buffering capability. The memory is physically distributed over the FPCs; logically each bank is a member of the overall memory pool of the system. Packets are stored in the memory independent of their arrival on a particular FPC.
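How cells might be spread across the eight distributed banks so that storage is independent of the arrival FPC can be sketched with a simple round-robin placement. The policy shown is purely an assumption for illustration; the actual placement algorithm is not described in the source:

```python
N_BANKS = 8   # one bank per FPC, as in the text (policy is illustrative)

def stripe(cells, start_bank=0):
    """Place a packet's cells across the distributed banks round-robin so
    that no single FPC's bank becomes a hot spot (assumed policy)."""
    return [((start_bank + i) % N_BANKS, cell) for i, cell in enumerate(cells)]
```

Striping of this kind is what lets eight physically separate banks behave as one logical pool with aggregate bandwidth.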
Switching and Forwarding Module (SFM)

Fig 3.2: Architecture of the M-series forwarding engine (FPCs connected to the SFM over high-speed 8 x OC-48 links, 20 Gbps half duplex)

Internet Processor II ASIC: The centralized processor for route lookup and forwarding decisions. It performs longest-match lookups at a rate of tens of millions of routes per second, which is

sufficient for wire-speed lookup rates. It maintains forwarding tables for the scheduling decision. The forwarding table is an image derived from the main routing table maintained by the routing engine. While the routing table is being updated, the forwarding tables would normally be blocked to avoid inconsistent and spurious scheduling, but this also contributes to system blockage, undesirable in high-end routers where a delay can cause huge backlogs. To prevent blocking of the forwarding table during periods of route instability, the Internet Processor ASIC maintains two forwarding tables. While the forwarding/routing table is being updated, scheduling is performed on the basis of the secondary table until the primary one has been updated to replace it. This mechanism is called atomic update. The ASIC is highly programmable via the JUNOS operating system to cater for QoS and traffic-engineering requirements.

Distributed Buffer Manager ASIC: Manages the 8 x 128-MB memory banks on the FPCs as parts of the system's shared memory pool. It generates notifications on packet arrival, communicates with the Internet Processor ASIC and, upon a scheduling decision, de-queues the packet from the memory.

Midplane: The shared-memory interconnect, of size 8 x OC-48. Its capacity differs among models, ranging from 40 to 160 Gbps full duplex. This is the main switching bus of the system and forms the communication link between the FPCs and the switching and forwarding modules.

The shared-memory architecture is prone to several inefficiencies. We describe some of them. The first issue is that segmenting packets into cells becomes wasteful when the packet size exceeds a multiple of the cell size by a small amount. In a 64-byte-cell system, a packet of length 65 bytes consumes two cells, 128 bytes. This becomes a problem when packets of varied lengths arrive at the interfaces, which is usually the case.
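The waste from fixed-length segmentation is easy to quantify: a packet needs ceil(length / 64) cells, and everything beyond the payload in its last cell is padding. A small sketch of the arithmetic:

```python
import math

CELL = 64  # cell size used in the example above

def cells_needed(pkt_len: int) -> int:
    return math.ceil(pkt_len / CELL)

def padding_fraction(pkt_len: int) -> float:
    """Fraction of switched bytes that are padding rather than payload."""
    carried = cells_needed(pkt_len) * CELL
    return (carried - pkt_len) / carried
```

For the 65-byte packet above, 63 of the 128 switched bytes (about 49%) are padding, while a 64-byte packet wastes nothing.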
Another problem associated with all fixed-length cell devices is the overhead generated by the notification information for all the cells pertaining to different packets. This information is vital for correct reassembly of the packets at the output interface. For an Internet-backbone-scale router as discussed above, this overhead can be very large due to the sheer volume of traffic serviced. The generation of notifications, processing of control information, segmentation and reassembly, storage and de-queuing all contribute to the overall packet-service latency of the system. Note that sustaining service levels for IP multicast traffic is particularly problematic in any shared-memory architecture. The reason is that each multicast packet is written to the shared memory just once, but needs to be read multiple times by different line cards, all from the same address. This situation increases the chances of other packets getting blocked. In the next section we describe the crossbar-fabric regime of backbone Internet routers. These also belong to the third generation of router architectures.

4 The Crossbar Switch Fabric Architectures

The essential architecture remains unchanged from the second-generation designs, with the major change being the replacement of the shared bus fabric by a crossbar switch fabric. Crossbars have been widely used in switching applications such as telephony, parallel-computer communication and data communications. They are extensively used for building higher-end backbone routing and switching devices due to their ease of implementation and, largely, the alleviation of the bus-contention issues inherent in single-bus architectures. The architecture comprises line cards, a main routing engine, several forwarding engines associated with the line cards and a switch fabric. This is depicted in Fig 4.1. In an N x N crossbar, with N input and N output ports, an input can be connected to any of the outputs simultaneously with other connections.
This is the non-blocking property of the crossbar architecture; it alleviates the bus-contention issues of single-bus systems.

Fig 4.1: Crossbar-based fabric (route processor and forwarding engines attached to the switch fabric)

The forwarding process can be roughly described by the following stages. 1) As a packet arrives on an input line card, the forwarding engine performs basic error checking to ascertain an error-free packet and header. It computes the hash offset into the forwarding table and loads the route. 2) The forwarding engine checks whether the cached route matches the destination of the datagram in the route cache; if not, the forwarding engine carries out an extended lookup of the forwarding table associated with it. This forwarding table is actually extracted from the routing table maintained by the main routing processor, a process similar to the shared-bus

architecture. On finding a match, the engine checks the IP time-to-live (TTL) field, computes the updated TTL and IP checksum, and determines whether the datagram is for the router itself. 3) The updated TTL and checksum are put into the IP header. The necessary routing information is extracted from the forwarding-table entry, and the updated IP header is written out along with link-layer information from the forwarding table. 4) As this process completes, the packet is queued in the input buffer at the line card. The crossbar uses a sophisticated buffering technique to prevent head-of-line blocking; this is discussed later. The forwarding engine, having performed route lookup and header modification, determines the output interface to which the packet is to be routed. This is notified to the centralized scheduler, which controls crossbar arbitration by maintaining all states of the crossbar. 5) Once a scheduling decision is reached, the forwarding engine gains access through the port-specific line card to the crossbar. The scheduler closes the necessary switch points on the crossbar and the packet traverses it to the output line card. The above is a very brief and rough description of the forwarding process; going into further details is not intended. With this viewpoint, in the next subsection we explain the implementation of a 16 x 16 crossbar switch fabric with simple implementation tools.

4.1 Hardware Architecture of a 16 x 16 Crossbar

There can be many ways to efficiently implement a crossbar in VLSI hardware, but in this section we present an overview of one implementation using very simple hardware macros. This design is implemented in a 0.35-micron CMOS process and requires some 600 MOS transistors. The overall area of the chip is 1.2 mm x 0.6 mm and the operating voltage is 5 Volts.

Fig 4.2: Block diagram of a 16 x 16 crossbar (controlled by select lines S0-S3)
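Returning to step 3 of the forwarding process: the TTL and checksum update need not recompute the checksum over the whole header. Because the TTL occupies the high byte of one 16-bit header word, decrementing it by one lets the one's-complement checksum be patched incrementally, in the style of RFC 1624 (HC' = ~(~HC + ~m + m')). A sketch, with 16-bit header words as Python ints and the header layout assumed:

```python
def decrement_ttl(ttl_word, checksum):
    """Decrement the TTL (high byte of its 16-bit header word) and patch
    the IP header checksum incrementally per RFC 1624."""
    new_word = ttl_word - 0x0100                 # TTL sits in the high byte
    c = (~checksum & 0xFFFF) + (~ttl_word & 0xFFFF) + new_word
    c = (c & 0xFFFF) + (c >> 16)                 # fold the carries back in
    c = (c & 0xFFFF) + (c >> 16)
    return new_word, ~c & 0xFFFF
```

The patched checksum matches a full recomputation over the modified header, which is why fast-path hardware can avoid touching the other header words at all.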
Our basic aim is to elaborate working principles and hardware implementation issues rather than give a complete design.

Fig 4.2: Multiplexer macro

This fabric has a throughput in the Gbps range, full duplex. It is implemented using cascaded 4 x 1 multiplexers with

switching controlled through a 64-bit shift register. Fig 4.2 shows the block-diagram representation of the device. The design is optimized for digital data-communications applications and high-speed digital switching equipment. The fabric offers Gbps-range throughput with nanosecond-scale switch latency. The crossbar connection pattern is stored in 16 on-chip registers. A serial interface is used to configure the crossbar connection pattern in 64 clock cycles. The building blocks are the shift register, the cascaded multiplexer array and fanout buffers. The multiplexer module is built from five 4 x 1 multiplexers cascaded in two stages. The 16 input pins are connected to each multiplexer macro in the manner shown in Fig 4.2. The shift register forms an important part of the control circuit: it stores the configuration pattern of the crossbar and manipulates it according to the directives of the scheduler. Fig 4.4 shows the logic schematic of the shift register, built out of D-Q flip-flops cascaded together; only the output bits needed to drive the select lines are used here.

Fig 4.4: Logic schematic of the shift register

Fig 4.3: (a) Symbolic representation of the 4 x 1 MUX; (b) logic function

The multiplexer macro comprises a set of five 4 x 1 multiplexers cascaded in two stages. All 16 outputs from the fanout buffers form the 16 inputs distributed over the first-stage muxes. The selection of the MUX module is done on the basis of select lines S2 and S3, while the output of the nth stage, Out(n), is selected by select lines S0 and S1. These select lines originate from the shift register, where the configuration pattern of the crossbar is stored. Fig 4.3(a) and (b) illustrate the block diagram of a 4 x 1 MUX along with its logic function. Input lines are selected from select inputs (1) and (2); the logic table for several configurations is shown in Fig 4.3(b). As an example, when S2 (1) and S3 (2) are 0 and 1, the selection maps to two states, as shown in the table.
This configuration of the select lines selects input 5 in Fig 4.3(a), which can have either a 0 or a 1 as its input; this input is latched to output 7. Therefore, by varying the two select bits, we can select an input line. For the output MUX this selection is governed by the S0 and S1 bits from the shift register, and one input from the input muxes is selected to appear on the Out(n) port. We have briefly discussed the implementation details of a crossbar with a view to giving the reader exposure to implementation issues in router architectures. The design above is not state-of-the-art and offers only a modest Gbps-range throughput. There are several products in the market offering aggregate throughputs of the order of several hundred Gbps; these have been achieved using similar technologies but at higher gate-density levels. Also, the MUX implementation is not an industry standard, as it has a larger switch latency, unacceptable in high-end products. Some architectures use tri-state buffers as the switching points of the crossbar owing to their compactness and faster switching speeds. There can be several other approaches as well, but due to limitations of space we do not discuss them. Also, the design details above correspond only to the fabric part of the overall crossbar architecture; several other components play a vital role in determining efficiency, foremost being the centralized scheduler and the scheduling algorithm. In the next subsection we discuss the inherent inefficiencies of the crossbar design and how they have been rectified using more sophisticated algorithms like iSLIP.

4.2 The Scheduling Algorithm

The crossbar is inherently non-blocking, but it suffers from other problems such as Head-of-Line (HOL) blocking [Mckeown 5]. Even under benign traffic patterns, HOL blocking limits the throughput to just under 60% (58.6%) of the aggregate

bandwidth. This can be alleviated using Virtual Output Queuing (VOQ), in which each input interface maintains a separate FIFO queue for the packets destined to each output interface. At the beginning of each time slot, a centralized scheduling algorithm examines the contents of all input queues and finds a conflict-free match between inputs and outputs. The iSLIP algorithm proposed in [Mckeown 6] is designed to meet the goals of high throughput and a starvation-free arbitration process. During each time slot, multiple iterations are performed to select a crossbar configuration, matching inputs to outputs. iSLIP uses rotating-priority ("round-robin") arbitration to schedule each active input and output in turn. A hardware implementation of this algorithm shows that it can operate at very high speed, making scheduling decisions in tens of nanoseconds; this implementation is discussed in Section 5. iSLIP attempts to converge quickly on a conflict-free match over multiple iterations, where each iteration consists of three steps.

1) Request: Each input sends a request to every output for which it has a queued cell.

2) Grant: If an output receives any requests, it chooses the one that appears next in a fixed round-robin schedule starting from the highest-priority element. The output notifies each input whether or not its request was granted.

3) Accept: If an input receives any grants, it accepts the one that appears next in a fixed round-robin schedule.

Fig 4.5: Schematic of the arbitration scheduler and its implementation

By considering only unmatched inputs and outputs, each iteration matches inputs and outputs that were not matched during earlier iterations. Fig 4.5 highlights the iteration steps and a way to implement them using priority encoders. The iSLIP algorithm offers 100% throughput for uniform, uncorrelated arrivals. No connection is starved, and service is guaranteed in at most N iterations for an N x N crossbar system. It is simple to implement, as discussed in the next section.
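A single request-grant-accept iteration can be sketched directly from the three steps above. This is an unoptimized software model — real iSLIP schedulers implement the round-robin choices with priority encoders, and the pointer-update rule here covers only the first iteration:

```python
def islip_iteration(requests, grant_ptr, accept_ptr, N):
    """One request-grant-accept iteration over an N x N crossbar.
    requests[i] is the set of outputs input i has queued cells for;
    grant_ptr / accept_ptr are per-output / per-input round-robin pointers."""
    # Request + Grant: each output picks the requesting input that appears
    # next at or after its pointer in round-robin order.
    grants = {}
    for out in range(N):
        asking = [i for i in range(N) if out in requests[i]]
        if asking:
            grants[out] = min(asking, key=lambda i: (i - grant_ptr[out]) % N)
    # Accept: each granted input takes the output next at or after its pointer.
    match = {}
    for inp in set(grants.values()):
        offered = [o for o, i in grants.items() if i == inp]
        out = min(offered, key=lambda o: (o - accept_ptr[inp]) % N)
        match[inp] = out
        # Pointers move one past the matched element (first iteration only).
        grant_ptr[out] = (inp + 1) % N
        accept_ptr[inp] = (out + 1) % N
    return match
```

Because the pointers advance one position past whoever was just served, a recently served input falls to the lowest priority at that output, which is what prevents starvation.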
In Section 5 we discuss various arbitration techniques for the crossbar architecture, along with a description of an implementation of the iSLIP scheduler.

5 Arbitration and Control Techniques

Bus-arbitration algorithms are implemented largely in hardware ASICs because of their fast computational ability for dedicated tasks. A controller/arbiter therefore constitutes a vital component of overall crossbar switch architectures and dictates the performance of a routing device. Two major approaches have been either to centralize the arbitration task on a single chip, usually a centralized scheduler, or to distribute it over multiple chips to attain high speeds by exploiting parallelism. In large non-blocking crossbars, the arbiter may become a bottleneck in terms of performance and reliability. A major research issue has therefore been to reduce the connection-setup latencies resulting from this bottleneck. The arbitration time and subsequent connection-setup operation can severely affect overall efficiency when heavy traffic loads cause all the input ports to request connections with output ports. This is especially true for high-end backbone routers with forwarding speeds reaching tens of millions of packets per second (Mpps), which require very fast arbitration and scheduling chips to keep up with the aggregate throughput of the system [Newman 8]. In what follows we give an overview of two popular control schemes and their variants, as well as the implementation of the iSLIP scheduler [Gupta], to highlight some hardware aspects of the arbitration operation and its tradeoffs.

5.1 Centralized Versus Distributed Arbitration

Conventional crossbar designs have N input ports placed conceptually perpendicular to N output ports, thus constituting an N x N matrix. The centralized bus arbiter receives a request from a port, say P0, for a connection to port Pk.
The arbitration unit will ascertain a free path through the fabric to port Pk and will subsequently close the required switch points for a successful connection. A more specific single-sided crossbar [Ghosh] has N port-lines with M buses perpendicularly bisecting each of them. For full-duplex communication, each port-line is assumed to comprise two wires, and the same is true for the buses. All the switching points of a crossbar can be distributed over several chips to ease design complexity or, with the current level of CMOS VLSI technology reaching 0.25 microns, can be

placed on a single chip [Mckeown 7] as well. In any case, the arbitration decision is based on the availability of an open path to the output port via an internal or external switching action. An arbiter might receive the following requests from a port:

1) a connect request, to set up a communication path to another port;

2) a disconnect request, to terminate an existing connection between two ports.

As described above, the connection procedure requires the arbiter to ascertain an unused bus column connecting the input and output port-lines for each connection request. With large crossbars reaching sizes of 72 x 72 port-lines, the computational overhead of a centralized controller may become a bottleneck, reducing the overall efficiency of the switch. A simple remedy is to use multiple arbitration units to reduce the connection-setup latency. Several schemes have been proposed in the literature, e.g. [Ghosh] [Zerrouk 3], describing methods to efficiently employ a distributed set of controllers, thereby optimizing efficiency and reliability. Distributing the request load uniformly among several controllers reduces the arbitration load on any single controller. The centralized scheme presents a single point of failure, which is undesirable; the distributed scheme increases the reliability of the switch by eliminating that single point of failure. In the following subsections we describe two such methodologies.

5.2 The Symmetric Triangular Scheme

Arbitration and control distribution can be achieved by dividing all source and destination port pairs into K subsets, assuming that there are K controllers C1 ... CK. With N ports, the total number of distinct port pairs is N(N-1)/2. This is best illustrated by representing the port pairs in a triangular pattern. Fig. illustrates this distribution for N = 6 ports and K controllers. An even distribution of port pairs is achieved if they are equally divided among the controllers.
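One way to realize this even division can be sketched as follows. The triangle of N(N-1)/2 pairs is cut horizontally into K regions of roughly equal area, one per controller; the exact region boundaries of the cited scheme may differ, so this only illustrates the load-balancing idea.

```python
# Behavioral sketch of the symmetric triangular distribution: row b of
# the triangle holds the pairs (a, b) with a < b, and rows are assigned
# to controllers in proportion to the triangle area they cover.

def controller_for(a, b, n, k):
    hi = max(a, b)
    total = n * (n - 1) // 2          # all distinct port pairs
    rows_before = hi * (hi - 1) // 2  # pairs lying in rows above row hi
    # map the row's cumulative share of the triangle to a controller
    return min(k - 1, rows_before * k // total)

n, k = 16, 4                          # illustrative sizes
loads = [0] * k
for b in range(n):
    for a in range(b):
        loads[controller_for(a, b, n, k)] += 1
print(loads)  # [36, 30, 25, 29] -- close to N(N-1)/(2K) = 30 pairs each
```

Because the regions are contiguous, each connect/disconnect request (a, b) maps to exactly one controller with a few integer operations.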
It can be achieved by geometrically dividing the port-pair triangle horizontally into K regions of approximately equal area. This scheme, however, is still prone to blocking if the total number of buses is less than or equal to N/2 + 1. Also, each region pertaining to a controller Ci represents a central point of failure for that region: in case of a failure, measures must be taken to redistribute the port-pair regions among the surviving controllers, which can disrupt service for a period of time. A more detailed description of this scheme is omitted here due to space constraints.

5.3 The Chessboard Scheme

This scheme, first proposed in [Ghosh], is a variant of the triangular scheme in that it enhances redundancy and fault tolerance by assigning two controllers to any service request (a, b). This enhancement comes with an increase in the complexity of the system. From an implementation viewpoint this concern may not be sizeable, given current improvements in the density of large-scale integration devices; moreover, it is weighed against the hazard of service disruption, which can be detrimental to performance in high-end routers and switches. Keeping the same triangular port-pair distribution as above, two controllers, referred to as primary and secondary, are assigned to each request; in other words, two redundant controllers look after each distinct region of port pairs. A request (a, b) is handled by the primary controller and, in case of its failure, is forwarded to the redundant secondary controller. This scheme is further optimized by the idea of dynamically distributing the service requests among the K redundant controllers, which offers the flexibility to adapt to individual controller loads. For example, a request is assigned to a pair of servicing controllers, and either of them, depending on individual load conditions, may service the request.
5.4 Arbiter for the iSLIP Scheduling Algorithm

In this section we present brief implementation details of a classical crossbar scheduler and its distributed arbiters, with iSLIP as the underlying algorithm. As described, the iSLIP algorithm is an iterative combination of three distinct operations, the request-grant-accept cycle, implemented on separate arbiters in a pipelined manner. Fig. 5 in the previous section illustrates the basic iSLIP scheduling algorithm and the specific task of each arbiter therein: a request from any of the N ports is received by a grant arbiter and is accepted by an accept arbiter on the line card. Pipelining reduces the number of clock cycles required: I iterations consume only 2I+2 cycles, instead of 3I for a non-pipelined scheme. The centralized scheduler runs at a clock speed of 175 MHz, and each time slot consists of 9 clock cycles; thus the three iterations usually involved in scheduling a packet from an input to an output port require 8 clock cycles. The scheduler is designed to configure the 32 x 32 crossbar once every 51 ns. The iSLIP algorithm employs round-robin grant and accept arbiters to ensure 100% throughput and approximately fair scheduling across all input queues, avoiding starvation. The round-robin arbiters are efficiently implemented as programmable priority encoders (PPEs). A PPE differs from a simple priority encoder in that an external input dictates which input has the highest priority. In what follows, we take a detailed look at the design of a high-speed round-robin arbiter as a PPE. Fig. 5.1 shows the circuit diagram of a round-robin arbiter. It has some state (called the round-robin

pointer, P_enc, of width log N bits) that points to the current highest-priority input. In every arbitration cycle it uses this pointer P_enc to choose one among the N incoming requests, through a PPE. The programmable priority encoder takes N 1-bit-wide requests and a log N-wide P_enc as input; it then chooses the first non-zero request value beyond Request[P_enc], resulting in an N-bit grant. Clearly, the core function of contention resolution is carried out by this combinational logic. This rotating priority is the essence of the iSLIP algorithm's guarantee that a virtual queue making a request at any port will not starve beyond N rounds. The pointer update mechanism is generally simple and can be performed in parallel. The path from request to grant determines the speed of the arbitration decision, and therefore the design of a fast PPE based on combinational logic is desirable. There are several ways to implement a fast PPE, but only the RIPPLE and CLA (carry-lookahead) designs are discussed here.

Fig. 5.1 Block diagram of a round-robin arbiter

5.5 RIPPLE & Carry-lookahead PPE

A simple exhaustive (EXH) PPE is shown in Fig. 5.2; we use this design as the premise for a more optimal version, i.e. the ripple and carry-lookahead method. The EXH PPE uses N duplicate copies of a simple priority encoder (PE), one per possible value of the round-robin pointer, such that Request is rotated i positions to the right before being presented to the i-th encoder; a programmable priority input then selects among the N PEs via a MUX with P_enc as the select signal.

Fig. 5.2 Block diagram of an exhaustive PPE

The RIPPLE design is based on the principle that the first non-zero request beyond the current pointer value can be found in a series of at most N sequential operations. First we look at input P_enc: if Request(P_enc) is 1, then Gnt(i) will be 1 for i = P_enc and 0 otherwise.
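This first-nonzero-beyond-the-pointer search can be modeled behaviorally as follows; the sketch captures the function the ripple chain computes, not its gate-level structure.

```python
# Behavioral model of the PPE's contention resolution: starting at the
# highest-priority position P_enc, scan the requests cyclically and
# grant the first 1 found.

def ppe(request, p_enc):
    """request: list of 0/1 bits; p_enc: index of highest priority.
    Returns (grant vector, AnyGnt flag)."""
    n = len(request)
    grant = [0] * n
    for k in range(n):            # at most N ripple steps
        i = (p_enc + k) % n
        if request[i]:
            grant[i] = 1          # first non-zero request at/after P_enc
            return grant, 1
    return grant, 0               # no request was 1: AnyGnt = 0

print(ppe([0, 1, 0, 1], p_enc=2))  # ([0, 0, 0, 1], 1)
print(ppe([0, 0, 0, 0], p_enc=2))  # ([0, 0, 0, 0], 0)
```

In hardware the same search is unrolled into N combinational stages rather than a sequential loop, which is what the mini_rpl chain below implements.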
If, on the other hand, Request(P_enc) is 0, we look at the input numbered (P_enc+1) mod N, and continue the process until either we find a non-zero request or we are back at the input number we started from (P_enc). In the latter case none of the input requests were 1, so Gnt(i) will be 0 for all i, and AnyGnt will be 0. A sub-block (mini_rpl), as shown in Fig. 5.3, can carry out each of these steps. P_dec is obtained by decoding the input signal P_enc. The 1-bit signal Imed[i] indicates whether the search process has already found a 1 (in which case Request[i] is to be ignored, Gnt[i] is 0, and Imed[i+1] is 1). Imed[i] is itself ignored if P_dec[i] is 1, marking the top-priority request and the starting point of the search.

Fig. 5.3 mini_rpl: a sub-block of the RIPPLE design

Once we have the sub-block mini_rpl, the RIPPLE PPE is simply a chain of N such sub-blocks connected back-to-back cyclically through Imed (Figure 5.4). This design is area-efficient, using a small number of gates in each of the N stages, but rippling through N stages increases latency, which is undesirable in high-speed arbiters. Introducing CLA (carry-lookahead) techniques eliminates the need for the combinational ripple loop: for example, a CLA of m bits would reduce the rippling delay to about 1/6th of the original. A detailed discussion of the CLA ripple arbiter is, however, beyond the scope of this study.

5.6 Discussion

There are several factors that should be kept in mind when considering a design for the arbitration process: first, the underlying algorithm that actually dictates the arbitration; and second, a feasible VLSI implementation consistent with the efficiency requirements of the

routing/switching device. Several algorithms have been proposed in the literature with a focus on optimizing overall throughput, but most of them lack the capability of fast implementation; the tradeoffs involved in the implementation phase can outweigh the predilection for achieving 100% throughput. As discussed in the sections above, an arbitration scheme that guarantees 100% reliability and throughput must be examined carefully to find a design that meets the speed requirements while keeping throughput in a reasonable range. The symmetric triangular scheme requires fewer buses and ensures reliability, but using redundant controllers in this manner wastes resources, since the secondary controllers would be idle most of the time; the chessboard scheme, a variant of the triangular scheme, therefore focuses on optimizing controller utilization. This comes at the expense of an increase in the number of buses, which in this case is a reasonable tradeoff. From the implementation viewpoint, a major factor is that VLSI technology is evolving rapidly (gate density doubles roughly every 18 months), implying that a restriction under today's technology may disappear in the near future. The current CMOS process is of the order of 0.18 microns, whereas the iSLIP arbitration module described in 5.5 was built on the premise of a 0.32-micron CMOS process. The advent of high-speed optical interconnects employed in very high throughput routers and switches puts a larger demand on contemporary VLSI processes, which constitute the drivers for the smart-pixel arrays in such systems. These interconnects are discussed in detail later.

Fig. 5.4 Ripple design

6 Terabit Routing Scenarios

Exponential growth of the Internet and the demand for more bandwidth have pushed beyond the capacity of today's routing architectures, prompting the adoption of highly scalable, multi-terabit routing solutions. The model employed so far has been to cluster multiple high-capacity routers at the core to increase its overall throughput, which is achieved by interconnecting routers via OC-12 or OC-48 line cards. This scenario has the following shortcomings:

1) The transit throughput dictates that the overall capacity cannot be greater than the interconnect speed. This can be overcome by increasing the line-card interconnections between the routers, but that adds capital cost and operational complexity.

2) There is a significant decrease in port density, as the usable ports left for incoming traffic are slashed to a fraction of the total, the rest being consumed by the interconnects.

3) Due to advances in DWDM, a single fiber can carry anywhere from 10 to 80 channels of 2.5 or 10 Gbps transmission paths; terminating these links at IP backbone routers, however, requires a large number of OC-48 or OC-192 line cards, which leads to an increase in the number of routers at the core.

Fig. 6 Router clusters form the core: some OC-48 links handle transit traffic among the routers, leaving only the remaining OC-48 ports usable

Fig. 6 illustrates such a scenario. The point of this discussion is that clustering routers via line cards is sub-optimal in terms of available bandwidth and port density. There are several ways to tackle the ever-increasing bandwidth demands. Two of these are: 1) to build a single device with a throughput of the order of several terabits/sec; or 2) to group several relatively smaller routers into an integrated system so that the combined throughput of the system is compatible with bandwidth requirements. As discussed, however, the latter suffers from the aforementioned performance deficiencies. In what follows we present ways to overcome these with the use of a totally integrated optical backplane.
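The port-density penalty of clustering can be illustrated with simple arithmetic; the cluster size and port count below are hypothetical, chosen only to make the effect concrete.

```python
# Illustrative port-density arithmetic for a clustered core: in a full
# mesh of R routers, each router spends R-1 of its P line-card ports on
# transit links to its peers rather than on customer-facing traffic.
R, P = 4, 16                     # assumed: 4 routers, 16 OC-48 ports each
transit_ports = R - 1            # ports consumed by inter-router links
usable = P - transit_ports
print(usable, R * usable)        # 13 usable ports per router, 52 overall
```

Out of 64 ports purchased, only 52 carry revenue traffic here, and the fraction worsens as the cluster grows, which is the sub-optimality the text describes.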
But first, we begin with an account of the techniques presented or used so far for building high-capacity VLSI routing/switching devices.

6.2 VLSI Terabit Switch Routers

The multi-gigabit VLSI routers presented earlier provide a basis for the construction of higher-capacity devices. The premise of research in this area has been to capitalize on the rapid advances in CMOS VLSI technology, with a CMOS crossbar switching fabric forming the basic constituent of each of these endeavors. In this section we present an overview of the design methodologies and contrasting features of two such architectures. We begin with the Tiny Tera project, as reported in [Mckeown 6].

6.3 The Tiny Tera: A Packet Switch Core

The Tiny Tera is an excellent example of a compact, high-capacity switch that can be employed in applications as diverse as ATM switches and Internet routers. The switch employs a similar kind of architecture to the GSR presented earlier. The fabric comprises a 32 x 32 crossbar that switches fixed-size packets across 32 I/O ports, each operating at the OC-192 (10 Gbps) line rate, for an aggregate bandwidth of 320 Gbps. The switch employs an input-queued architecture with three logical elements: the ports, a central crossbar fabric, and a centralized packet scheduler. Input queuing has the advantage that the buffer memories are not required to run faster than the line rate, but it inherently suffers from head-of-line (HOL) blocking. This is rectified using the Virtual Output Queuing (VOQ) scheme, in which each input maintains a separate queue for each output. The switch also uses the iSLIP scheduling algorithm, which achieves approximately 100% throughput while making scheduling decisions in less than 51 ns in a 0.32-micron CMOS process.

Fig. 6.2 Logical switch architecture with I/O ports

Fig. 6.2 shows the typical port configuration of the switch. Packets are fragmented on arrival into fixed-size 16-bit segments, which are stored in the input buffers at each port and await access to the crossbar. The centralized scheduler makes routing decisions based on the current configuration of the crossbar and the contents of all input queues; each decision is passed on to the port concerned and, in turn, the relevant crossbar slice is informed. The scheduler and crossbar slices form the centralized hub of the switch. The scheduler is connected to the I/O ports and to the slice stack through high-speed serial links operating at several Gbps. The slice stack comprises several slices, each of which is a printed circuit board containing a 1-bit 32 x 32 crossbar chip. The reason for having multiple crossbar slices is to exploit parallelism, which provides the speed-up needed to reach high throughput; it also allows for scalability, as the number of slices can be increased to meet varied requirements. In the discussion that follows we intentionally skip the description of the crossbar VLSI architecture and the iSLIP scheduling algorithm, as these were covered in the previous section. The distinguishing feature of this design is the I/O port architecture, which is optimized to fragment packets and store fixed-size segments. Another area we stress is the serial communication between the scheduler, the I/O ports, and the slice stack, which lends itself to flexible pipelining. The logical steps of the data flow through the port architecture can be summarized as follows:

1) The line cards support OC-192 links. As packets arrive on the ports they are fragmented into 16-bit segments and distributed over several serial communication links.

2) A centralized port processor receives the segments and assigns them memory addresses in the on-chip SRAM. It also generates a header, which records a particular segment's association with a certain packet and the source/destination pointers for a routing decision.

3) The port processor also communicates the necessary information about all arriving packets and queue status to the centralized scheduler. This communication takes place over a high-capacity serial link.
4) When a scheduling decision is reached, it is conveyed to the port processor, which determines the memory location from which the packet is to be dequeued; the packet is subsequently put on the crossbar links, a group of serial interconnects to the crossbar corresponding to the input serial links.

5) As with all fixed-size packet switching systems, the Tiny Tera also requires some mechanism for output queuing to allow for packet-reassembly latencies. Since the ports are full duplex, the same on-chip SRAM has to be used for both input and output queuing; this requirement prompts efficient usage of the memory through dynamic partitioning.

Fig. 6.3 illustrates the functionality of each port. The fact that this switch is designed to switch fixed-size packets

leads to the generation of additional overhead in terms of storage, additional headers, and the necessity of output queuing. However, such a design lends itself to efficient usage of the 16-bit-wide crossbar grid; it is therefore a tradeoff between larger overhead and efficient crossbar utilization.

Fig. 6.3 Block architecture of each port card

The on-chip communication presents a significant challenge to this architecture, as the segmentation and reassembly of packets, centralized scheduling, and high throughput requirements dictate a high degree of control communication between the ASICs. This communication must also be sufficiently fast to maintain the control and update information of the switching fabric and the status of the queuing buffers. Since the usual communication is between a low-fanout chip, such as a port circuit, and a high-fanout chip, such as the crossbar, it is optimal to locate the phase-adjustment and clock-synchronization circuitry not on the higher-fanout chip but on the lower-fanout one; this reduces complexity and lowers power consumption. Fig. 6.4 shows the link between one of the data-slice chips on a port (the smart end) and a crossbar slice chip (the dumb end). In a multi-board system such as this, performance is often limited by the distribution of high-frequency clocks and the resultant clock skew and phase noise, so a careful clock distribution scheme that minimizes skew and jitter is desired. A PLL circuit is used to achieve clock stability, which is vital for tracking feedback-loop drift over the long feedback path, and steady, precise phase alignment is accomplished with a digitally controlled clock phase interpolator.

6.4 Terabit CMOS Transceiver Router

In this section we detail another highly integrated, fixed-length-cell-based terabit switch fabric [Wang ].
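A rough buffer-bandwidth budget motivates the speed-up discussion that follows: with combined input/output queuing and a fabric speed-up S, an input buffer is written at the line rate R and read by the fabric at S*R, so the memory must sustain roughly (1+S)*R. The numbers below are illustrative assumptions, not figures from [Wang].

```python
# Per-port memory bandwidth needed under combined input/output queuing
# (illustrative): writes arrive at the line rate R while the fabric
# drains the buffer at S*R, so the buffer sees about (1+S)*R in total.
R_gbps = 10          # assumed OC-192 line rate, rounded
S = 2                # speed-up factor of 2, as quoted in the text
buffer_bw = (1 + S) * R_gbps
print(buffer_bw)     # 30 -- Gbps of raw memory bandwidth per port
```

Commodity DRAM cannot sustain such random-access rates, which is why the fabric resorts to the Clos-like memory arrangement described next.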
Fundamental to the fabric is the integration of high-speed CMOS transceivers to provide a low-power, low pin-count, and low chip-count system with a raw capacity of 2 Tbps. The fabric can support user bandwidth of up to 60 Gbps with a speed-up factor of 2. With IQ (input queuing), OQ (output queuing), and CIOQ (combined input-output queuing), it is necessary to buffer and retrieve data at a speed equal to or faster than the line rate. Speed-up is required when the DRAM random-access speed cannot sustain high-capacity line rates such as OC-48 or OC-192, which is normally the case with commercially available memories. This can be rectified by using a multi-stage Clos-like memory architecture [Ayer 7], which allows each memory device to run slower than the line rate; it was proven in [Ayer 7] that such an architecture can guarantee the required speed-up. The first challenge in the design of the on-chip communication infrastructure is that the crossbar and scheduler chips need to terminate 32 or more communication links, each originating from a different board, and these links must achieve high data rates in a noisy digital environment. This is achieved using a variant of the traditional serial link, with the phase-adjustment circuitry placed in the transmitter instead of, as previously practiced, in the receiver; phase adjustment and synchronization play important roles in the serial communication between the two devices, as described above.

Fig. 6.4 Serial link architecture

Fig. 6.5 Clos-like memory architecture for speed-up

The switch in question employs a combined input and output queuing system and also provides per-class, per-port QoS. The distinguishing feature of this fabric is the low-power, low pin-count, and highly reliable CMOS


More information

Lecture 16: Network Layer Overview, Internet Protocol

Lecture 16: Network Layer Overview, Internet Protocol Lecture 16: Network Layer Overview, Internet Protocol COMP 332, Spring 2018 Victoria Manfredi Acknowledgements: materials adapted from Computer Networking: A Top Down Approach 7 th edition: 1996-2016,

More information

Optical networking technology

Optical networking technology 1 Optical networking technology Technological advances in semiconductor products have essentially been the primary driver for the growth of networking that led to improvements and simplification in the

More information

CH : 15 LOCAL AREA NETWORK OVERVIEW

CH : 15 LOCAL AREA NETWORK OVERVIEW CH : 15 LOCAL AREA NETWORK OVERVIEW P. 447 LAN (Local Area Network) A LAN consists of a shared transmission medium and a set of hardware and software for interfacing devices to the medium and regulating

More information

Chapter 4 Network Layer: The Data Plane

Chapter 4 Network Layer: The Data Plane Chapter 4 Network Layer: The Data Plane A note on the use of these Powerpoint slides: We re making these slides freely available to all (faculty, students, readers). They re in PowerPoint form so you see

More information

CSE 3214: Computer Network Protocols and Applications Network Layer

CSE 3214: Computer Network Protocols and Applications Network Layer CSE 314: Computer Network Protocols and Applications Network Layer Dr. Peter Lian, Professor Department of Computer Science and Engineering York University Email: peterlian@cse.yorku.ca Office: 101C Lassonde

More information

4. Networks. in parallel computers. Advances in Computer Architecture

4. Networks. in parallel computers. Advances in Computer Architecture 4. Networks in parallel computers Advances in Computer Architecture System architectures for parallel computers Control organization Single Instruction stream Multiple Data stream (SIMD) All processors

More information

Quality of Service in the Internet

Quality of Service in the Internet Quality of Service in the Internet Problem today: IP is packet switched, therefore no guarantees on a transmission is given (throughput, transmission delay, ): the Internet transmits data Best Effort But:

More information

PUSHING THE LIMITS, A PERSPECTIVE ON ROUTER ARCHITECTURE CHALLENGES

PUSHING THE LIMITS, A PERSPECTIVE ON ROUTER ARCHITECTURE CHALLENGES PUSHING THE LIMITS, A PERSPECTIVE ON ROUTER ARCHITECTURE CHALLENGES Greg Hankins APRICOT 2012 2012 Brocade Communications Systems, Inc. 2012/02/28 Lookup Capacity and Forwarding

More information

Quality of Service in the Internet

Quality of Service in the Internet Quality of Service in the Internet Problem today: IP is packet switched, therefore no guarantees on a transmission is given (throughput, transmission delay, ): the Internet transmits data Best Effort But:

More information

A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup

A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup Yan Sun and Min Sik Kim School of Electrical Engineering and Computer Science Washington State University Pullman, Washington

More information

Memory Systems IRAM. Principle of IRAM

Memory Systems IRAM. Principle of IRAM Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several

More information

Chapter Seven Morgan Kaufmann Publishers

Chapter Seven Morgan Kaufmann Publishers Chapter Seven Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored as a charge on capacitor (must be

More information

Chapter 4 Network Layer

Chapter 4 Network Layer Chapter 4 Network Layer Computer Networking: A Top Down Approach Featuring the Internet, 3 rd edition. Jim Kurose, Keith Ross Addison-Wesley, July 2004. Network Layer 4-1 Chapter 4: Network Layer Chapter

More information

Introduction. Router Architectures. Introduction. Introduction. Recent advances in routing architecture including

Introduction. Router Architectures. Introduction. Introduction. Recent advances in routing architecture including Introduction Router Architectures Recent advances in routing architecture including specialized hardware switching fabrics efficient and faster lookup algorithms have created routers that are capable of

More information

Chapter 7 Hardware Overview

Chapter 7 Hardware Overview Chapter 7 Hardware Overview This chapter provides a hardware overview of the HP 9308M, HP 930M, and HP 6308M-SX routing switches and the HP 6208M-SX switch. For information about specific hardware standards

More information

Backup Exec 9.0 for Windows Servers. SAN Shared Storage Option

Backup Exec 9.0 for Windows Servers. SAN Shared Storage Option WHITE PAPER Optimized Performance for SAN Environments Backup Exec 9.0 for Windows Servers SAN Shared Storage Option 1 TABLE OF CONTENTS Executive Summary...3 Product Highlights...3 Approaches to Backup...4

More information

Network layer (addendum) Slides adapted from material by Nick McKeown and Kevin Lai

Network layer (addendum) Slides adapted from material by Nick McKeown and Kevin Lai Network layer (addendum) Slides adapted from material by Nick McKeown and Kevin Lai Routers.. A router consists - A set of input interfaces at which packets arrive - A set of output interfaces from which

More information

InfiniBand SDR, DDR, and QDR Technology Guide

InfiniBand SDR, DDR, and QDR Technology Guide White Paper InfiniBand SDR, DDR, and QDR Technology Guide The InfiniBand standard supports single, double, and quadruple data rate that enables an InfiniBand link to transmit more data. This paper discusses

More information

Local Area Network Overview

Local Area Network Overview Local Area Network Overview Chapter 15 CS420/520 Axel Krings Page 1 LAN Applications (1) Personal computer LANs Low cost Limited data rate Back end networks Interconnecting large systems (mainframes and

More information

Process size is independent of the main memory present in the system.

Process size is independent of the main memory present in the system. Hardware control structure Two characteristics are key to paging and segmentation: 1. All memory references are logical addresses within a process which are dynamically converted into physical at run time.

More information

Quality of Service in the Internet. QoS Parameters. Keeping the QoS. Leaky Bucket Algorithm

Quality of Service in the Internet. QoS Parameters. Keeping the QoS. Leaky Bucket Algorithm Quality of Service in the Internet Problem today: IP is packet switched, therefore no guarantees on a transmission is given (throughput, transmission delay, ): the Internet transmits data Best Effort But:

More information

Introduction to Cisco ASR 9000 Series Network Virtualization Technology

Introduction to Cisco ASR 9000 Series Network Virtualization Technology White Paper Introduction to Cisco ASR 9000 Series Network Virtualization Technology What You Will Learn Service providers worldwide face high customer expectations along with growing demand for network

More information

Tag Switching. Background. Tag-Switching Architecture. Forwarding Component CHAPTER

Tag Switching. Background. Tag-Switching Architecture. Forwarding Component CHAPTER CHAPTER 23 Tag Switching Background Rapid changes in the type (and quantity) of traffic handled by the Internet and the explosion in the number of Internet users is putting an unprecedented strain on the

More information

EE 122: Router Design

EE 122: Router Design Routers EE 22: Router Design Kevin Lai September 25, 2002.. A router consists - A set of input interfaces at which packets arrive - A set of output interfaces from which packets depart - Some form of interconnect

More information

Introduction to Routers and LAN Switches

Introduction to Routers and LAN Switches Introduction to Routers and LAN Switches Session 3048_05_2001_c1 2001, Cisco Systems, Inc. All rights reserved. 3 Prerequisites OSI Model Networking Fundamentals 3048_05_2001_c1 2001, Cisco Systems, Inc.

More information

Quality of Service (QoS)

Quality of Service (QoS) Quality of Service (QoS) The Internet was originally designed for best-effort service without guarantee of predictable performance. Best-effort service is often sufficient for a traffic that is not sensitive

More information

On Scheduling Unicast and Multicast Traffic in High Speed Routers

On Scheduling Unicast and Multicast Traffic in High Speed Routers On Scheduling Unicast and Multicast Traffic in High Speed Routers Kwan-Wu Chin School of Electrical, Computer and Telecommunications Engineering University of Wollongong kwanwu@uow.edu.au Abstract Researchers

More information

IP Video Network Gateway Solutions

IP Video Network Gateway Solutions IP Video Network Gateway Solutions INTRODUCTION The broadcast systems of today exist in two separate and largely disconnected worlds: a network-based world where audio/video information is stored and passed

More information

Module 1. Introduction. Version 2, CSE IIT, Kharagpur

Module 1. Introduction. Version 2, CSE IIT, Kharagpur Module 1 Introduction Version 2, CSE IIT, Kharagpur Introduction In this module we shall highlight some of the basic aspects of computer networks in two lessons. In lesson 1.1 we shall start with the historical

More information

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC)

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) D.Udhayasheela, pg student [Communication system],dept.ofece,,as-salam engineering and technology, N.MageshwariAssistant Professor

More information

CSC 4900 Computer Networks: Network Layer

CSC 4900 Computer Networks: Network Layer CSC 4900 Computer Networks: Network Layer Professor Henry Carter Fall 2017 Villanova University Department of Computing Sciences Review What is AIMD? When do we use it? What is the steady state profile

More information

Memory. Objectives. Introduction. 6.2 Types of Memory

Memory. Objectives. Introduction. 6.2 Types of Memory Memory Objectives Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured. Master the concepts

More information

A 400Gbps Multi-Core Network Processor

A 400Gbps Multi-Core Network Processor A 400Gbps Multi-Core Network Processor James Markevitch, Srinivasa Malladi Cisco Systems August 22, 2017 Legal THE INFORMATION HEREIN IS PROVIDED ON AN AS IS BASIS, WITHOUT ANY WARRANTIES OR REPRESENTATIONS,

More information

Routers Technologies & Evolution for High-Speed Networks

Routers Technologies & Evolution for High-Speed Networks Routers Technologies & Evolution for High-Speed Networks C. Pham Université de Pau et des Pays de l Adour http://www.univ-pau.fr/~cpham Congduc.Pham@univ-pau.fr Router Evolution slides from Nick McKeown,

More information

Topic & Scope. Content: The course gives

Topic & Scope. Content: The course gives Topic & Scope Content: The course gives an overview of network processor cards (architectures and use) an introduction of how to program Intel IXP network processors some ideas of how to use network processors

More information

Switching. An Engineering Approach to Computer Networking

Switching. An Engineering Approach to Computer Networking Switching An Engineering Approach to Computer Networking What is it all about? How do we move traffic from one part of the network to another? Connect end-systems to switches, and switches to each other

More information

Professor Yashar Ganjali Department of Computer Science University of Toronto.

Professor Yashar Ganjali Department of Computer Science University of Toronto. Professor Yashar Ganjali Department of Computer Science University of Toronto yganjali@cs.toronto.edu http://www.cs.toronto.edu/~yganjali Today Outline What this course is about Logistics Course structure,

More information

Computer Networks. Instructor: Niklas Carlsson

Computer Networks. Instructor: Niklas Carlsson Computer Networks Instructor: Niklas Carlsson Email: niklas.carlsson@liu.se Notes derived from Computer Networking: A Top Down Approach, by Jim Kurose and Keith Ross, Addison-Wesley. The slides are adapted

More information

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University CS 455: INTRODUCTION TO DISTRIBUTED SYSTEMS [NETWORKING] Shrideep Pallickara Computer Science Colorado State University Frequently asked questions from the previous class survey Why not spawn processes

More information

Chapter 4. Computer Networking: A Top Down Approach 5 th edition. Jim Kurose, Keith Ross Addison-Wesley, sl April 2009.

Chapter 4. Computer Networking: A Top Down Approach 5 th edition. Jim Kurose, Keith Ross Addison-Wesley, sl April 2009. Chapter 4 Network Layer A note on the use of these ppt slides: We re making these slides freely available to all (faculty, students, readers). They re in PowerPoint form so you can add, modify, and delete

More information

Master Course Computer Networks IN2097

Master Course Computer Networks IN2097 Chair for Network Architectures and Services Prof. Carle Department for Computer Science TU München Master Course Computer Networks IN2097 Prof. Dr.-Ing. Georg Carle Christian Grothoff, Ph.D. Chair for

More information

100 GBE AND BEYOND. Diagram courtesy of the CFP MSA Brocade Communications Systems, Inc. v /11/21

100 GBE AND BEYOND. Diagram courtesy of the CFP MSA Brocade Communications Systems, Inc. v /11/21 100 GBE AND BEYOND 2011 Brocade Communications Systems, Inc. Diagram courtesy of the CFP MSA. v1.4 2011/11/21 Current State of the Industry 10 Electrical Fundamental 1 st generation technology constraints

More information

Lecture 16: Router Design

Lecture 16: Router Design Lecture 16: Router Design CSE 123: Computer Networks Alex C. Snoeren Eample courtesy Mike Freedman Lecture 16 Overview End-to-end lookup and forwarding example Router internals Buffering Scheduling 2 Example:

More information

Managed IP Services from Dial Access to Gigabit Routers

Managed IP Services from Dial Access to Gigabit Routers Managed IP Services from Dial Access to Gigabit Routers Technical barriers and Future trends for IP Differentiated Services Grenville Armitage, PhD Member of Technical Staff High Speed Networks Research,

More information

Network Layer Introduction

Network Layer Introduction Network Layer Introduction Tom Kelliher, CS 325 Apr. 6, 2011 1 Administrivia Announcements Assignment Read 4.4. From Last Time Congestion Control. Outline 1. Introduction. 2. Virtual circuit and datagram

More information

Computer Networks LECTURE 10 ICMP, SNMP, Inside a Router, Link Layer Protocols. Assignments INTERNET CONTROL MESSAGE PROTOCOL

Computer Networks LECTURE 10 ICMP, SNMP, Inside a Router, Link Layer Protocols. Assignments INTERNET CONTROL MESSAGE PROTOCOL Computer Networks LECTURE 10 ICMP, SNMP, Inside a Router, Link Layer Protocols Sandhya Dwarkadas Department of Computer Science University of Rochester Assignments Lab 3: IP DUE Friday, October 7 th Assignment

More information

Adaptive Resync in vsan 6.7 First Published On: Last Updated On:

Adaptive Resync in vsan 6.7 First Published On: Last Updated On: First Published On: 04-26-2018 Last Updated On: 05-02-2018 1 Table of Contents 1. Overview 1.1.Executive Summary 1.2.vSAN's Approach to Data Placement and Management 1.3.Adaptive Resync 1.4.Results 1.5.Conclusion

More information

Growth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs.

Growth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs. Internetworking Multiple networks are a fact of life: Growth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs. Fault isolation,

More information

The Network Processor Revolution

The Network Processor Revolution The Network Processor Revolution Fast Pattern Matching and Routing at OC-48 David Kramer Senior Design/Architect Market Segments Optical Mux Optical Core DWDM Ring OC 192 to OC 768 Optical Mux Carrier

More information

Routing, Routers, Switching Fabrics

Routing, Routers, Switching Fabrics Routing, Routers, Switching Fabrics Outline Link state routing Link weights Router Design / Switching Fabrics CS 640 1 Link State Routing Summary One of the oldest algorithm for routing Finds SP by developing

More information

A General Purpose Queue Architecture for an ATM Switch

A General Purpose Queue Architecture for an ATM Switch Mitsubishi Electric Research Laboratories Cambridge Research Center Technical Report 94-7 September 3, 994 A General Purpose Queue Architecture for an ATM Switch Hugh C. Lauer Abhijit Ghosh Chia Shen Abstract

More information

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction Chapter 6 Objectives Chapter 6 Memory Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured.

More information

Optical Packet Switching

Optical Packet Switching Optical Packet Switching DEISNet Gruppo Reti di Telecomunicazioni http://deisnet.deis.unibo.it WDM Optical Network Legacy Networks Edge Systems WDM Links λ 1 λ 2 λ 3 λ 4 Core Nodes 2 1 Wavelength Routing

More information

CCNA Exploration Network Fundamentals. Chapter 09 Ethernet

CCNA Exploration Network Fundamentals. Chapter 09 Ethernet CCNA Exploration Network Fundamentals Chapter 09 Ethernet Updated: 07/07/2008 1 9.0.1 Introduction 2 9.0.1 Introduction Internet Engineering Task Force (IETF) maintains the functional protocols and services

More information

ECE 551 System on Chip Design

ECE 551 System on Chip Design ECE 551 System on Chip Design Introducing Bus Communications Garrett S. Rose Fall 2018 Emerging Applications Requirements Data Flow vs. Processing µp µp Mem Bus DRAMC Core 2 Core N Main Bus µp Core 1 SoCs

More information

CSC 401 Data and Computer Communications Networks

CSC 401 Data and Computer Communications Networks CSC 401 Data and Computer Communications Networks Network Layer Overview, Router Design, IP Sec 4.1. 4.2 and 4.3 Prof. Lina Battestilli Fall 2017 Chapter 4: Network Layer, Data Plane chapter goals: understand

More information

Master Course Computer Networks IN2097

Master Course Computer Networks IN2097 Chair for Network Architectures and Services Prof. Carle Department for Computer Science TU München Chair for Network Architectures and Services Prof. Carle Department for Computer Science TU München Master

More information

Implementation of a leaky bucket module for simulations in NS-3

Implementation of a leaky bucket module for simulations in NS-3 Implementation of a leaky bucket module for simulations in NS-3 P. Baltzis 2, C. Bouras 1,2, K. Stamos 1,2,3, G. Zaoudis 1,2 1 Computer Technology Institute and Press Diophantus Patra, Greece 2 Computer

More information

CS 426 Parallel Computing. Parallel Computing Platforms

CS 426 Parallel Computing. Parallel Computing Platforms CS 426 Parallel Computing Parallel Computing Platforms Ozcan Ozturk http://www.cs.bilkent.edu.tr/~ozturk/cs426/ Slides are adapted from ``Introduction to Parallel Computing'' Topic Overview Implicit Parallelism:

More information

2. LAN Topologies Gilbert Ndjatou Page 1

2. LAN Topologies Gilbert Ndjatou Page 1 2. LAN Topologies Two basic categories of network topologies exist, physical topologies and logical topologies. The physical topology of a network is the cabling layout used to link devices. This refers

More information

DESIGN AND IMPLEMENTATION OF AN AVIONICS FULL DUPLEX ETHERNET (A664) DATA ACQUISITION SYSTEM

DESIGN AND IMPLEMENTATION OF AN AVIONICS FULL DUPLEX ETHERNET (A664) DATA ACQUISITION SYSTEM DESIGN AND IMPLEMENTATION OF AN AVIONICS FULL DUPLEX ETHERNET (A664) DATA ACQUISITION SYSTEM Alberto Perez, Technical Manager, Test & Integration John Hildin, Director of Network s John Roach, Vice President

More information

Traditional network management methods have typically

Traditional network management methods have typically Advanced Configuration for the Dell PowerConnect 5316M Blade Server Chassis Switch By Surendra Bhat Saurabh Mallik Enterprises can take advantage of advanced configuration options for the Dell PowerConnect

More information

Cisco IOS Switching Paths Overview

Cisco IOS Switching Paths Overview This chapter describes switching paths that can be configured on Cisco IOS devices. It contains the following sections: Basic Router Platform Architecture and Processes Basic Switching Paths Features That

More information

Chapter 4: network layer. Network service model. Two key network-layer functions. Network layer. Input port functions. Router architecture overview

Chapter 4: network layer. Network service model. Two key network-layer functions. Network layer. Input port functions. Router architecture overview Chapter 4: chapter goals: understand principles behind services service models forwarding versus routing how a router works generalized forwarding instantiation, implementation in the Internet 4- Network

More information