PARALLEL ALGORITHMS FOR IP SWITCHERS/ROUTERS


THE UNIVERSITY OF NAIROBI DEPARTMENT OF ELECTRICAL AND INFORMATION ENGINEERING FINAL YEAR PROJECT. PROJECT NO. 60 PARALLEL ALGORITHMS FOR IP SWITCHERS/ROUTERS OMARI JAPHETH N. F17/2157/2004 SUPERVISOR: DR. G.S.O ODHIAMBO EXAMINER: DR. OUMA H. ABSOLOMS

PROJECT OBJECTIVES To design and analyse a parallel lookup algorithm that can run on routers. To model a suitable simulation of the algorithm and investigate whether it improves router throughput.

INTRODUCTION Internet traffic continues to increase rapidly year by year. The explosive growth of the Internet in the number of users and the variety of services has been matched by growth in transmission-link capacity: advances in fibre-optic technology have created a huge supply of wide area network (WAN) bandwidth. New applications (e.g. the Web, video conferencing, remote imaging) have higher bandwidth needs than traditional applications, so further increases in users, hosts, domains, and traffic are expected. To provide adequate Internet performance, communication links in the Internet backbone are being upgraded to high-speed fibre-optic links. Even gigabit links are insufficient, however, unless Internet routers can forward packets at gigabit rates. When an Internet router receives a packet P on an input link interface, it uses the destination address in P to look up a routing database. The result of the lookup gives the output link interface to which P is forwarded. There is some additional bookkeeping, such as updating packet headers, but the major tasks in packet forwarding are address lookup and switching packets between link interfaces. Address lookup is thus a major bottleneck in routers' high-speed forwarding. High-speed IP routers and switches are therefore required to forward an exponentially increasing volume of traffic at high speeds; core routers currently operate at multi-gigabit or terabit speeds. Parallel processing, i.e. parallel IP lookups, can increase the lookup rate.

COMPONENTS OF A ROUTER A router has four components: input ports, output ports, a switching fabric, and a routing processor. An input port is the point of attachment for a physical link and the point of entry for incoming packets. The switching fabric interconnects the input ports with the output ports. An output port queues packets received from the switching fabric and transmits them on the outgoing link. The routing processor participates in routing protocols and creates the forwarding table that is used in packet forwarding.

THE PACKET FORWARDING PROCESS The IP packet processing steps are as follows:
1. IP header validation: as a packet enters an ingress port, the forwarding logic verifies all Layer 3 information (header length, packet length, protocol version, checksum, etc.).
2. Route lookup and header processing: the router performs an IP address lookup using the packet's destination address to determine the egress (outbound) port, and performs all IP forwarding operations (TTL decrement, header checksum update, etc.).
3. Packet classification: in addition to the Layer 3 information, the forwarding engine examines Layer 4 and higher-layer packet attributes against any QoS and access control policies.
4. With the Layer 3 and higher-layer attributes in hand, the forwarding engine performs one or more functions in parallel: it associates the packet with the appropriate priority and egress port(s) (an internal routing tag gives the switch fabric the egress port, the QoS priority queue the packet is to be stored in, and the drop priority for congestion control); redirects the packet to a different destination (ICMP redirect); or drops the packet according to a congestion control policy (e.g. Random Early Detection (RED)) or a security policy.
5. The forwarding engine notifies the system controller that a packet has arrived.
6. The system controller reserves a memory location for the arriving packet.
7. Once the packet has been written to shared memory, the system controller signals the appropriate egress port(s). For multicast traffic, multiple egress ports are signalled.
8. The egress port(s) extract the packet from the known shared-memory location using an appropriate scheduling algorithm, e.g. Weighted Fair Queuing (WFQ).
9. When the destination egress port(s) have retrieved the packet, they notify the system controller, and the memory location is made available for new traffic.
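Steps 1 and 2 above (header validation and the TTL/checksum update) can be sketched in code. This is a minimal illustration of the idea, not the project's implementation: it checks the version, header length, and checksum of an IPv4 header, then decrements the TTL and recomputes the checksum.

```python
def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words; a valid header sums to 0xFFFF."""
    if len(header) % 2:
        header += b"\x00"
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total

def validate_and_forward(header: bytearray) -> bool:
    """Verify Layer 3 fields, then decrement TTL and refresh the checksum.
    Returns False (drop) on any validation failure or TTL expiry."""
    if len(header) < 20 or header[0] >> 4 != 4:   # version must be 4
        return False
    ihl = (header[0] & 0x0F) * 4                  # header length in bytes
    if ihl < 20 or len(header) < ihl:
        return False
    if ipv4_checksum(bytes(header[:ihl])) != 0xFFFF:
        return False                              # corrupted header
    if header[8] <= 1:                            # TTL expired
        return False
    header[8] -= 1                                # decrement TTL
    header[10:12] = b"\x00\x00"                   # zero the checksum field
    csum = 0xFFFF - ipv4_checksum(bytes(header[:ihl]))
    header[10:12] = csum.to_bytes(2, "big")       # write the new checksum
    return True
```

After a successful call the header is ready for the egress port: the TTL is one lower and the checksum is consistent again.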

IP VERSION 4 PACKET FORMAT
Version number specifies the version of the IP protocol and determines the packet format (version 6 is also defined; it is similar to version 4 but has longer addresses).
Header Length (HLen) gives the number of 32-bit words in the header.
Type of Service (TOS) field is infrequently used but allows for application-specific treatment of packets.
Fragmentation identifier, flags, and offset are used for fragmentation and reassembly of IP packets.
Time to Live (TTL) specifies the remaining number of hops before the packet should be discarded; this prevents infinite looping of packets.
Protocol is used for demultiplexing at the destination.
Checksum is used for detecting errors in the header.
Address fields specify the source and destination; the hierarchical address structure supports a large-scale internet.
Options are rarely used but must be supported in complete IP protocol implementations.
Classless Inter-Domain Routing (CIDR) extends the subnet idea: arbitrary address prefixes can represent a set of addresses that are treated as a group for the purposes of forwarding packets.
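The fields listed above can be unpacked from a raw header with Python's struct module. A small sketch (the field names in the returned dict are mine, not from the slides):

```python
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Unpack the fixed 20-byte IPv4 header into its named fields."""
    ver_ihl, tos, total_len, ident, flags_frag, ttl, proto, csum, src, dst = \
        struct.unpack("!BBHHHBBHII", raw[:20])
    dotted = lambda a: ".".join(str((a >> s) & 0xFF) for s in (24, 16, 8, 0))
    return {
        "version": ver_ihl >> 4,
        "hlen_words": ver_ihl & 0x0F,      # header length in 32-bit words
        "tos": tos,
        "total_length": total_len,
        "identifier": ident,
        "flags": flags_frag >> 13,         # top 3 bits
        "frag_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,                 # e.g. 6 = TCP, 17 = UDP
        "checksum": csum,
        "src": dotted(src),
        "dst": dotted(dst),
    }
```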

PARALLEL IP ROUTING TABLE LOOKUP ALGORITHM PSEUDOCODE
Input: Destination IP Address (DIP)
Output: Next-Hop IP Address (NIP)
Step 1: Input the destination IP address.
Step 2: Use the ID bits of each DIP to allocate it to a memory block, then push it into the first-in first-out buffer (FIFO) of the corresponding memory module.
Step 3: For each memory unit Mi, in parallel do:
  While (true) do
    If the local FIFO is empty then continue;
    Else pop a DIP from the local FIFO;
Step 4: Lookup: for each DIP popped from a FIFO, use the ID to associate the DIP with its corresponding NIP in that memory unit; pop the NIP from the memory module and push it to the output buffer.
Step 5: Pop the NIP from the output buffer as the result.
Step 6: Stop.
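The steps above can be sketched as runnable Python, with one thread and one FIFO per memory unit. This is an illustrative model under two assumptions of mine: each unit's table is an exact-match dictionary rather than a prefix search, and the 3-bit ID is taken from the low bit of the second, third, and fourth octets (the slide's bits 16, 24, and 32).

```python
import queue
import threading

NUM_UNITS = 8

def unit_id(dip: str) -> int:
    # Step 2: form the 3-bit ID from bits 16, 24 and 32 of the address
    # (assumed here to be the low bits of octets 2, 3 and 4).
    o = [int(x) for x in dip.split(".")]
    return ((o[1] & 1) << 2) | ((o[2] & 1) << 1) | (o[3] & 1)

def worker(table: dict, fifo: queue.Queue, out: queue.Queue) -> None:
    # Steps 3-4: pop DIPs from the local FIFO, look up the NIP in the
    # local table, and push the result to the shared output buffer.
    while True:
        dip = fifo.get()
        if dip is None:              # sentinel: no more DIPs for this unit
            break
        out.put((dip, table.get(dip, "no route")))

def parallel_lookup(tables: list, dips: list) -> dict:
    fifos = [queue.Queue() for _ in range(NUM_UNITS)]
    out = queue.Queue()
    threads = [threading.Thread(target=worker, args=(tables[i], fifos[i], out))
               for i in range(NUM_UNITS)]
    for t in threads:
        t.start()
    for dip in dips:                 # Step 2: allocate each DIP by its ID bits
        fifos[unit_id(dip)].put(dip)
    for f in fifos:                  # signal every unit to finish
        f.put(None)
    for t in threads:
        t.join()
    results = {}
    while not out.empty():           # Step 5: drain the output buffer
        dip, nip = out.get()
        results[dip] = nip
    return results
```

Because each DIP goes to exactly one FIFO, the eight workers never touch the same table, which is the independence the parallel design relies on.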

ALGORITHM FLOW CHART

QUEUING ANALYSIS Queuing theory is used to model the FIFO queues in the lookup subsystem. Assume that the arrival process of incoming IP addresses is a Poisson process with average arrival rate λ, and that the lookup service times are exponentially distributed with service rate μ and mean service time T = 1/μ seconds. An M/M/8/K queue model represents the system. In Kendall notation the first letter, M, denotes the distribution of the inter-arrival time; the second M denotes the distribution of the service time; the third number, 8, denotes the number of servers; and K denotes the maximum size of the waiting queue. M (Markov) denotes the exponential distribution; the letter stems from the fact that the exponential distribution is the only continuous distribution with the Markov property, i.e. it is memoryless. Since the incoming addresses are spread evenly over the eight FIFOs, each queue behaves as an M/M/1/K queue with arrival rate λ/8 while the service rate remains μ. Classic queuing theory then gives the corresponding parameters: the average number of destination IP addresses (packet headers) waiting in each memory module's queue is L_q = λ² / (μ(μ − λ)), and the average waiting time in a queue is W_q = λ / (μ(μ − λ)).
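The two formulas can be checked numerically. A small sketch with illustrative values (λ and μ are the rates defined above; the specific numbers are mine):

```python
def mm1_metrics(lam: float, mu: float):
    """Average queue length L_q and waiting time W_q for an M/M/1 queue,
    using the formulas above; requires lam < mu for a stable queue."""
    assert lam < mu, "queue is unstable when the arrival rate reaches mu"
    l_q = lam ** 2 / (mu * (mu - lam))
    w_q = lam / (mu * (mu - lam))
    return l_q, w_q

# Each of the 8 FIFOs sees lam/8, so with total arrival rate lam the
# per-queue traffic intensity is rho = (lam / 8) / mu. Here we sweep rho
# directly to see how the waiting time grows with load.
mu = 1.0
delays = {rho: mm1_metrics(rho * mu, mu)[1] for rho in (0.2, 0.6, 0.9)}
```

The sweep shows the behaviour the next slide's graph describes: W_q stays modest up to ρ ≈ 0.6 and then climbs steeply as ρ approaches 1.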

According to the analysis presented above, the graph shows the running average queuing delay as a function of the traffic intensity ρ. The graph shows that the queuing delay does not grow sharply as long as ρ remains under 0.6.

LOOKUP PROCESS The routing table is modelled by a Microsoft Access database containing a list of prefixes and their corresponding next-hop addresses. The parallel memory units are modelled by eight tables in the routing database, where each table is stored in a single memory unit. Lookup is effected by searching for an entry that matches a prefix obtained from the first octet of the IPv4 address input into the system as a destination IP address (DIP).

ID (bits 16, 24 & 32):  000  001  010  011  100  101  110  111
Memory unit:              1    2    3    4    5    6    7    8

The table shows the criterion used to allocate the destination IP addresses to the different memory blocks: bits 16, 24, and 32 of the address are used as the first, second, and third ID bits respectively. The search operation is implemented in the Visual Basic programming language. The database is accessed using an ActiveX Data Object (ADO) control referred to as adodc. ADO is a language-neutral object model that lets you manipulate data accessed by an underlying OLE DB provider (an OLE DB provider is a data manager that interfaces directly with a database). ADO's Recordset object contains the records returned from a database plus the cursor for those records. This control links the search code to the database. Every memory block is linked separately so that the search operation in one memory block is independent of the other units. The search operation in the individual memory units is absolute: at a given time instant, a packet that arrives at the system input is searched for only in its corresponding memory block. Thus for every packet arrival, only the one memory block of the eight that meets the criterion in table 3.1 will be searched.
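The allocation criterion and the per-unit search can be sketched outside VB/Access as plain Python. The bit numbering and the first-octet key follow my reading of the slide, and the linear scan mirrors the O(N) search through a unit's rows:

```python
def memory_unit(dip: str) -> int:
    """Map a DIP to memory unit 1..8 using bits 16, 24 and 32 of the
    address (taken here as the low bit of octets 2, 3 and 4)."""
    o = [int(x) for x in dip.split(".")]
    id_bits = ((o[1] & 1) << 2) | ((o[2] & 1) << 1) | (o[3] & 1)
    return id_bits + 1            # ID 000 -> unit 1 ... 111 -> unit 8

def lookup_in_unit(table, dip: str):
    """Linear search of one unit's (prefix, NIP) rows, keyed on the
    prefix taken from the first octet of the DIP."""
    key = dip.split(".")[0]
    for prefix, nip in table:     # worst case: N comparisons
        if prefix == key:
            return nip
    return None                   # no matching prefix in this unit
```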

RESULTS


ANALYSIS The forwarding table is subdivided into eight tables stored in separate, parallel-accessible memory blocks. This parallelism minimises memory contention, hence IP lookups are faster.

COMPLEXITY ANALYSIS Given that the routing tables for units 1 to 8 have N1, N2, N3, N4, N5, N6, N7, and N8 entries respectively, the worst-case lookup complexity for the units is O(N1), O(N2), O(N3), O(N4), O(N5), O(N6), O(N7), and O(N8) respectively. In this design N1 = N2 = N3 = N4 = N5 = N6 = N7 = N8 = 60, so a lookup in one unit terminates in at most 60 comparisons. The algorithm was employed in a model network simulated with Packet Tracer software; the parameters of router B were used in this analysis. It is a Cisco 3600 router with an Embedded Services Processor containing 10 packet processor elements (PPEs) at a clock rate of 1 GHz, and 32 MB of 50 ns DRAM. The lookup process is a memory read operation. With the 1 GHz processor (i.e. a 1 ns clock), the 50 ns DRAM performs the first read in 50 clock cycles. If the full bandwidth of the gigabit link is utilised, the maximum rate of incoming packets corresponds to 1 Gbps. Each memory lookup unit is assigned a single packet processor element, so a single comparison takes 50 ns. In the worst case the lookup operation takes (50 × N) ns; with N = 60 this gives a lookup time of 3 µs. The throughput is therefore 1/(50 ns × 60) = 333,333.3 packets/second, or 333,333.3 × 32 ≈ 10.67 Mbps per memory unit in the worst case (since a packet is taken to be 32 bits long). The total throughput of the system is then 10.67 × 8 ≈ 85.33 Mbps.
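The worst-case arithmetic above can be reproduced as a short calculation (the 50 ns read time, 60 entries, and 32-bit packet size are the slide's figures):

```python
READ_NS = 50        # one DRAM read = one comparison, in ns
ENTRIES = 60        # N: routing table entries per memory unit
PACKET_BITS = 32    # packet length assumed on the slide
UNITS = 8

lookup_ns = READ_NS * ENTRIES              # 3000 ns = 3 us per worst-case lookup
pps_per_unit = 1e9 / lookup_ns             # ~333,333 packets/s per unit
bps_per_unit = pps_per_unit * PACKET_BITS  # ~10.67 Mbps per unit
total_bps = bps_per_unit * UNITS           # ~85.33 Mbps for the whole system
```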

MODEL NETWORK A model network simulated with Packet Tracer software.

ANALYSIS (CONTD) In the best case, the algorithm complexity reduces to O(1) for every memory unit: the algorithm can terminate in only one comparison. The best case occurs when the lookup procedure matches the first entry in the routing table. With the 50 ns DRAM and the 1 GHz clock (a 1 ns cycle), the first read completes in 50 clock cycles, i.e. the read operation takes 50 ns. The throughput for one memory unit is therefore 1/50 ns = 20 Mpackets/second, giving 20 × 8 × 32 = 5.12 Gbps for the whole system. With parallel processing, the workload (IP lookup) is shared between eight different processors, as opposed to shared-memory systems, which have a single global memory accessed by all processors. When many processors make simultaneous requests to a single memory location or bank, memory access becomes a bottleneck and access times can increase greatly. Because of limitations on processor-to-memory bandwidth, performance suffers when too many processes attempt to access the same memory location simultaneously. For this reason, this design employs a parallel physical memory layout to ensure that the memory system can handle as many simultaneous requests as possible: different processors are assigned different memory units so that lookups can take place in separate units concurrently.
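The best-case figure works out the same way, with a single 50 ns read per lookup:

```python
READ_NS = 50        # one DRAM read in the O(1) best case, in ns
PACKET_BITS = 32    # packet length assumed on the slide
UNITS = 8

pps_per_unit = 1e9 / READ_NS                     # 20 million packets/s per unit
total_bps = pps_per_unit * PACKET_BITS * UNITS   # 5.12 Gbps system throughput
```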

CONCLUSION In this project the major bottleneck in high-speed packet forwarding, routing table lookup, has been analysed. A simple parallel algorithm has been proposed that can speed up the lookup process and hence improve router throughput. The algorithm has been analysed in relation to a network simulated with Packet Tracer software, and it has been shown that the router throughput is significantly increased. Faster lookups imply higher router throughput (i.e. the rate at which packets are transferred from the input interface to the appropriate output interface, measured in bits per second), because routing table lookups take the largest share of the router's resources (memory access time and processing speed) in the packet forwarding process. Thus speeding up the lookup process has the significant effect of improving the router throughput.

RECOMMENDATIONS FOR FURTHER WORK In future this work can be extended by investigating other aspects of parallel processing. For instance, the criterion for splitting the routing table entries across separate memory modules should be implemented in the routing processor. The routing table should also be implemented as a trie-based structure, as opposed to the Access database used in this work.

THANK YOU.