OPNET Modeling of an IP Router with Scheduling Algorithms to Implement Differentiated Services

Hiroshi Yamada
NTT Service Integration Laboratories, Communication Traffic Project, Traffic Solution Group
3-9-11, Midori-cho, Musashino-shi, Tokyo, 180-8585, Japan
tomoka@hashi.tnl.ntt.co.jp

ABSTRACT

Enterprises are increasingly depending on their networks to carry mission-critical applications. To ensure the performance of such applications, quality of service (QoS) control is required, and routers with QoS functions should be deployed in networks. This paper describes the OPNET™ modeling of an IP router with two scheduling algorithms: custom queuing (CQ) and priority queuing (PQ). A processor module for executing the scheduling algorithms is added between the IP queue module and the point-to-point transmitter module in the IP router. I ran two simulations using the developed CQ and PQ routers: one for a network model in which two offices are connected by a WAN link, and one for a network model to which a voice-over-IP client is connected.

Key words: IP router, Differentiated services, QoS, Custom Queuing, Priority Queuing.

1. INTRODUCTION

Enterprises are increasingly depending on their networks to carry mission-critical applications. The quality of service (QoS) provided by these networks must be controlled to ensure the performance of such applications. According to a white paper [1] by Cisco Systems, the basic framework for providing QoS in a network includes: QoS within a single network element (queuing, scheduling, traffic shaping, etc.); signaling techniques (for example, RSVP) for coordinating end-to-end QoS between network elements; and policy, management, and accounting functions (for example, policy routing) to control and administer the end-to-end QoS of traffic across the network. These three components of the basic QoS framework are necessary to ensure QoS across a network comprising heterogeneous technologies (IP, ATM, Frame Relay, and so on).
Furthermore, there are three basic levels of end-to-end QoS service: best-effort service (lack of QoS) is basic connectivity with no guarantees; differentiated service (soft QoS) is treated better than best-effort service, but only as a statistical preference, not a hard-and-fast guarantee; and guaranteed service (hard QoS) is the absolute reservation of network resources for specific traffic.

In this paper, I concentrate on the first component of the basic QoS framework. I discuss QoS within a single network element and explain how I used OPNET™ modeling to add two scheduling algorithms to the IP router to implement the second level of end-to-end QoS service, i.e., differentiated service.

This paper is organized as follows. In Section 2, I describe two priority-based algorithms for scheduling traffic to be sent over a point-to-point link: custom queuing (CQ) and priority queuing (PQ). In Section 3, I explain how I incorporated these algorithms into the OPNET™ IP router model. I also expanded the net_app_mgr process in the GNA model to add ON-OFF voice traffic, which represents the voice-packet-arrival process. The process and node models of the IP router and voice traffic were coded using OPNET™ version 4.0.A. In Section 4, I present the results of two simulations using these process models. In this paper, variables and the names of process models are shown in italics.

2. QUEUING SCHEDULING ALGORITHMS

I investigated the use of two queuing scheduling algorithms to handle traffic in an IP router: priority queuing (PQ) and custom queuing (CQ) [1],[2],[3]. The PQ scheme assigns a strict priority to important traffic. It can flexibly prioritize traffic according to the network protocol, the incoming interface, the packet size, the source/destination address, and so on. In the model I developed, priority is assigned according to the destination port number of the TCP or UDP packet encapsulated in the IP packet; the port number corresponds to an application.
As shown in Fig. 1, each arriving packet is placed in one of four subqueues (High, Medium, Normal, or Low) based on its assigned priority. During packet transmission, the scheduling algorithm gives the higher-priority queues absolute preference over the lower-priority queues. This algorithm is simple, but it can cause long queuing delays and jitter in the low-priority traffic.

[Figure 1. Priority queuing (PQ): traffic destined for the interface is classified into High, Medium, Normal, and Low subqueues; the transmit queue is served by absolute-priority scheduling, allocating link bandwidth by strict priority.]

The CQ scheme is used to enable various applications with different minimum-bandwidth or latency requirements to share the network. It provides a guaranteed bandwidth at potential congestion points, assuring specified traffic of a fixed portion of the available bandwidth and leaving the residual bandwidth to the other traffic. As shown in Fig. 2, each class of packets is assigned to a specific subqueue, and the router uses the CQ algorithm to dequeue the packets in a weighted round-robin fashion. Generally, packets are classified according to the protocol, the incoming interface, and so on. In the developed CQ model, the classification rule is the same as in the developed PQ scheme. Both the PQ and CQ algorithms are statically configured and do not automatically adapt to changing network conditions.

3. OPNET™ Modeling of an IP Router with the CQ or PQ Control Scheme

3.1 Node Model

The standard OPNET™ IP router model consists of several basic modules: a routing protocol processor module (rip, ospf); an IP encapsulation and decapsulation processor module (ip_encap); an IP queue module (ip); an IP ARP processor module (arp_1(2)); a data-link-layer protocol processor or queue module, which implements data-link-layer protocols such as Ethernet and frame relay (mac_1(2), FRIPIF, FRAD, etc.); and transmitter and receiver modules (pr_*_*, pt_*_*, hub_rx_*_*). An asterisk (*) denotes an arbitrary index number.
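The strict-priority selection described in Section 2 can be sketched as follows (a minimal illustration in Python, not the OPNET process model itself; the subqueue names follow Fig. 1):

```python
from collections import deque

# Subqueues in decreasing priority order, as in Fig. 1.
PRIORITIES = ["High", "Medium", "Normal", "Low"]

def pq_dequeue(subqueues):
    """Return the next packet under strict priority, or None if all
    subqueues are empty. A higher-priority queue is always served
    before any lower-priority queue, regardless of waiting times."""
    for name in PRIORITIES:
        q = subqueues[name]
        if q:
            return q.popleft()
    return None

# Example: a Medium packet is always served before a Low one.
qs = {name: deque() for name in PRIORITIES}
qs["Low"].append("pkt-low")
qs["Medium"].append("pkt-med")
assert pq_dequeue(qs) == "pkt-med"
assert pq_dequeue(qs) == "pkt-low"
assert pq_dequeue(qs) is None
```

The example also shows why low-priority traffic can starve: as long as any higher-priority subqueue holds packets, the Low subqueue is never visited.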
To allocate the bandwidth of the point-to-point output link based on a strict priority or a configured proportion of the bandwidth, I added a queuing module, cq_wire_*, between the ip queue module and the point-to-point transmitter module, as shown in Fig. 3. Furthermore, a statistic wire connects the point-to-point transmitter to the queuing module. This wire carries information indicating a change of state of the point-to-point transmitter, i.e., from busy to idle. When an IP packet is transferred to the output link from the point-to-point transmitter, the state of the transmitter changes and a statistic interrupt is sent to the queuing module. The IP router model with the PQ control scheme has the same structure, except that the process model in the added queuing module is different.

[Figure 2. Custom queuing (CQ): traffic destined for the interface is classified into subqueues (Class 1: 30%, Class 2: 15%, Class 3: 15%, ..., Class N: 5%); the transmit queue is served in weighted round-robin fashion by byte counts, allocating a configured proportion of the link bandwidth.]

[Figure 3. Node model for the IP router with the custom queuing (CQ) control scheme.]

In this router model, an arriving IP packet is received at the receiver module and transferred to the IP module. The interface used to reach the next node is determined by the routing functions in the IP module. When the determined interface is a point-to-point transmitter with
a queuing scheduling scheme, the packet is transferred to the added queuing module, where it is placed in the appropriate subqueue and later dequeued according to the queuing scheduling algorithm. The IP packet is then transferred to the point-to-point transmitter.

3.2 Process Model

The state diagram of the process model in the queuing module is shown in Fig. 4. I expanded and modified the process model acb_fifo into one that represents the CQ or PQ scheduling algorithm by adding two functions.

[Figure 4. Process model with the custom queuing (CQ) control scheme in the cq_wire_* module.]

In the arrival state, a classification function is added. It works as follows. Get the IP packet from the packet stream and check the remote port number of the encapsulated TCP or UDP packet. Compare the remote port number with the contents of the configured classification table, then determine the subqueue index into which the arriving packet is inserted. To define the classification table for the CQ control scheme, input the number of classes to classify, the port number, and the proportion of the output-link bandwidth for each class. For PQ, input the port number and the priority class. The number of subqueues and their capacities are defined using the subqueue attribute window. In the CQ control scheme, the number of subqueues must be the same as the number of classes to classify.

In the svc_start state, a dequeuing function for the CQ or PQ scheduling algorithm is added. I use the following notation.

subqid: index of the subqueue from which the next packet is dequeued.
cur_subq: index of the subqueue from which the last packet was dequeued.
start_id: index of the subqueue at which to start checking whether each subqueue is empty.
counter_value: total bytes sent from a subqueue in the current round of round-robin scheduling.
threshold_value: maximum number of bytes that can be sent in one round; it is calculated from the bandwidth of the output link and the proportion of the bandwidth for each class defined in the classification table.
loop: number of iterations spent checking whether the start_id subqueue is empty; its initial value is 0.

The basic procedure of the dequeuing function in the CQ scheduling algorithm is as follows. It derives the value of subqid from the value of cur_subq.

Step 1. Check whether the cur_subq subqueue is empty. If it is, start_id is set to (cur_subq + 1) % (the number of subqueues) and counter_value is set to 0.0. Otherwise, start_id is set to cur_subq.

Step 2. Perform the following procedure in a loop, starting from the start_id subqueue. The value of loop is incremented by 1 on each pass.

Step 2.1. If the start_id subqueue is not empty, access the packet at the head of the line of this subqueue, get the total size of this packet, and add it to counter_value. The next step depends on the values of loop and counter_value.

Case 1. If counter_value is less than threshold_value, set subqid to start_id and exit the loop.

Case 2. If counter_value is not less than threshold_value and loop is less than the number of subqueues, set counter_value to 0.0, set start_id to (start_id + 1) % (the number of subqueues), and go back to Step 2.

Case 3. If counter_value is not less than threshold_value and loop is not less than the number of subqueues, set counter_value to 0.0, set subqid to start_id, and exit the loop.

Step 2.2. If the start_id subqueue is empty, set start_id to (start_id + 1) % (the number of subqueues) and go back to Step 2.

3.3 REMARKS

The temporary variable loop was introduced in the above procedure to handle a long packet whose total byte size is larger than threshold_value. Such a packet cannot be dequeued in the first round of round-robin scheduling, although it can be in the second round (Step 2.1, Case 3).

The classification is based on the remote port number of the TCP or UDP packet encapsulated in the IP packet. The port number of some applications is fixed; for example, for ftp it is 21, and for e-mail it is 25.
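The dequeuing procedure of Steps 1 through 2.2, including the loop safeguard for oversized packets, can be sketched as follows (a minimal Python illustration, not the OPNET Proto-C code; the per-subqueue byte thresholds and packet representation are assumptions):

```python
from collections import deque

class CQScheduler:
    """Sketch of the CQ dequeuing function described in Section 3.2.
    thresholds[i] is the byte budget of subqueue i per round, derived
    in the paper from the configured bandwidth proportions."""

    def __init__(self, thresholds):
        self.thresholds = thresholds
        self.n = len(thresholds)
        self.subqueues = [deque() for _ in range(self.n)]
        self.cur_subq = 0        # subqueue served last
        self.counter_value = 0.0 # bytes counted in the current round

    def enqueue(self, subq, packet, size):
        self.subqueues[subq].append((packet, size))

    def dequeue(self):
        """Return the next (packet, size) chosen by weighted
        round-robin, or None if every subqueue is empty."""
        if not any(self.subqueues):
            return None
        # Step 1: stay on cur_subq if it still holds packets.
        if self.subqueues[self.cur_subq]:
            start_id = self.cur_subq
        else:
            start_id = (self.cur_subq + 1) % self.n
            self.counter_value = 0.0
        loop = 0
        # Step 2: scan the subqueues round-robin.
        while True:
            loop += 1
            if self.subqueues[start_id]:
                _, size = self.subqueues[start_id][0]
                self.counter_value += size
                if self.counter_value < self.thresholds[start_id]:
                    subqid = start_id            # Case 1: budget left
                    break
                elif loop < self.n:
                    self.counter_value = 0.0     # Case 2: budget spent
                    start_id = (start_id + 1) % self.n
                else:
                    self.counter_value = 0.0     # Case 3: oversized packet,
                    subqid = start_id            # send it in this round
                    break
            else:
                # Step 2.2: skip an empty subqueue.
                start_id = (start_id + 1) % self.n
        self.cur_subq = subqid
        return self.subqueues[subqid].popleft()

# Example: subqueue 0 has a 1000-byte budget, subqueue 1 a 500-byte one.
sched = CQScheduler([1000, 500])
sched.enqueue(0, "a1", 600)
sched.enqueue(0, "a2", 600)
sched.enqueue(1, "b1", 400)
assert [p for p, _ in iter(sched.dequeue, None)] == ["a1", "b1", "a2"]
```

In the example, subqueue 0 exhausts its 1000-byte budget after "a1", so the scheduler moves on to subqueue 1 before returning for "a2" in the next round.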
If we want to classify applications without a fixed port number, we must modify the GnaT_App structure in the gna.h header file. I added fixed port numbers for the applications without a defined port number.
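The port-number classification performed in the arrival state can be sketched as a simple table lookup (the table entries below are illustrative assumptions, not the paper's actual configuration):

```python
# Hypothetical classification table mapping TCP/UDP destination ports
# to subqueue indices; unmatched traffic falls into a default class.
CLASSIFICATION_TABLE = {
    21: 0,   # ftp control
    25: 1,   # smtp (e-mail)
    80: 2,   # http
}
DEFAULT_CLASS = 3

def classify(dest_port):
    """Return the subqueue index for a given destination port."""
    return CLASSIFICATION_TABLE.get(dest_port, DEFAULT_CLASS)

assert classify(21) == 0    # ftp goes to subqueue 0
assert classify(25) == 1    # e-mail goes to subqueue 1
assert classify(5004) == 3  # no fixed port: default class
```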
A policy for classifying based on the packet format (IP, IPX, etc.) or protocol (TCP, UDP, etc.) can easily be introduced by modifying the classification function in the arrival state.

3.4 MODEL VERIFICATION

To evaluate the performance of the CQ scheduling algorithm in the above process model, I checked its behavior with the simple network model shown in Fig. 5. Each source node (src_node_*) consists of a packet generator and a transmitter. The quasi-TCP packets have four fields: src_port, dest_port, header, and data. The quasi-IP packets have two fields: header and data. In the packet generator module, quasi-TCP packets are generated and encapsulated into quasi-IP packets. The quasi-IP packets are then sent to the transmitter without any transport-protocol processing.

[Figure 5. Simple network model used for model verification (128-Kbps output link).]

Each source node constantly generates 552-byte IP packets at 10 PPS (packets per second). The src_port value is different for each application. The simple router model (CQ_Router) consists of five receivers, one transmitter, and one queue module performing CQ scheduling. A subqueue with infinite capacity is set for each of the four applications. The proportions of the output-link bandwidth (128 Kbps) assigned to the four applications (http, ftp, mail, and rlogin) are 50%, 10%, 20%, and 20%, respectively. The destination node (DEST_node) receives the packets and calculates the received bits per second (BPS) for each application. The total bit rate generated by the four source nodes is 176.64 Kbps, which is larger than 128 Kbps, so the link is overloaded. Fig. 6 shows that the received BPS of each application is close to the configured proportion of the bandwidth. The residual bandwidth of http, about 20 Kbps, is shared among the other applications according to the configured proportions.

3.5 VOICE SOURCE MODEL

I used the following voice source model in the simulations.
A voice application using a 32-Kbps coding scheme generates a 32-byte packet every 8 ms during talkspurts. During silences, no packets are generated. I added a function representing this packet-generation process to the GNA model; it is similar to the nam_(application name)_spec_parse function in the net_app_mgr process model. UDP was used as the transport protocol. The data structure created by this function was used to invoke the child process of the net_app_mgr process, the gna_cli process. I also added functions to the net_app_serv process model for measuring the end-to-end delay of a voice packet, counting the number of received packets, and calculating the received bits per second. In voice-over-IP communications [4], sessions between sender and receiver are established using the H.323 or RTCP protocol. In this model, sessions were established in the net_app_mgr process. A voice source model using the G.728 or G.729 coding scheme is left for further study.

[Figure 6. Simulated BPS by application and in total: total 128 Kbps; http 44 Kbps (552*8*10 bps), less than its 64-Kbps share (128 Kbps * 0.5); ftp 14 Kbps. Axes: received BPS (10 Kbps) vs. elapsed time (sec).]

4. SIMULATION EXPERIMENT

Because I developed the process models using OPNET™ version 4.0.A, the two simulations were done using 4.0.A-compatible models in version 5.1.D.

Simulation 1

The network model is shown in Fig. 7. The ftp and mail client and server models were the ordinary GNA client and server models. The traffic was configured as follows. The ftp client sent files using only the put command. The average size of a file was 1024 bytes. The sending rate was 56,250 files/hour. The mail client sent and received e-mail. The average message size was 1024 bytes. The rates of sending and of requesting were each 56,250 messages or requests per hour.
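Assuming the quoted averages refer to application payload, a quick check shows that the offered ftp load alone already equals the 128-Kbps link capacity, so Simulation 1 runs the WAN link in overload:

```python
# Offered ftp load in Simulation 1: 1024-byte files at 56,250 files/hour.
file_bytes, files_per_hour = 1024, 56_250
ftp_offered_bps = file_bytes * 8 * files_per_hour // 3600  # exact integer

# The ftp traffic by itself fills the whole 128-Kbps WAN link; the mail
# sending rate is the same, so total offered load far exceeds capacity.
assert ftp_offered_bps == 128_000
```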
The capacity of the subqueues in the CQ (PQ) router was infinite, and the proportions of the output-link bandwidth (128 Kbps) assigned to the three traffic classes (ftp, mail, and others) were 70%, 20%, and 10%. The respective priorities in the PQ router were high, medium, and normal.
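The configured CQ proportions translate into per-class bandwidth shares as follows (a sketch of the arithmetic; 128 Kbps is the WAN-link rate given above):

```python
# Per-class bandwidth implied by the CQ proportions in Simulation 1.
link_bps = 128_000
shares = {"ftp": 0.70, "mail": 0.20, "others": 0.10}
configured_bps = {cls: round(link_bps * p) for cls, p in shares.items()}

assert configured_bps == {"ftp": 89_600, "mail": 25_600, "others": 12_800}
```

The 25.6-Kbps mail share is the bound against which the measured e-mail throughput in the results below can be compared.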
As shown in Fig. 8, the received BPS for ftp using the CQ router was higher than with the normal router. The received BPS of e-mail using the CQ router, 23 Kbps, is within the assigned bandwidth of 25.6 Kbps. With the PQ router, e-mail traffic was not dequeued from its subqueue after 15 sec; only the high-priority ftp traffic was transferred.

[Figure 7. Network model for Simulation 1 (128-Kbps WAN link).]

[Figure 8. Received BPS in Simulation 1 for ftp and e-mail under the PQ, CQ, and normal routers; axes: received BPS (10 Kbps) vs. elapsed time (sec).]

Simulation 2

The network model is shown in Fig. 9. The custom application client and server models were the ordinary GNA client and server models. The voice client and receiver models were those described in Section 3.5. The traffic was configured as follows. The custom application client established a session once a minute and sent a single 5,000-byte data unit during each session. The average talkspurt was 350 ms long, and the average silence was 650 ms long. Only one voice source was active during the simulation. The capacities of the subqueues for the voice and custom applications in the CQ router were 2400 bits and infinite, respectively. Voice traffic is delay sensitive rather than loss sensitive; to prevent a packet from waiting a long time in the buffer for service, the output-link buffer capacity for the voice application is finite in practice. In the CQ router, 90% of the 64-Kbps bandwidth was assigned to the voice application and 10% to the other applications.

Fig. 10 shows the end-to-end delay for voice packets as a function of the elapsed time. Custom-application data was transferred after 310 seconds of elapsed time, which increased the delay. The end-to-end delay in the network model using the CQ router was shorter than with the normal router.

5. CONCLUDING REMARKS

I have described the modeling of OPNET™ IP routers with CQ and PQ scheduling algorithms for implementing differentiated services. Two simulations showed that these routers improved application performance.
The developed models allow us to evaluate the performance of a network that uses routers with QoS functions. These models are also useful for configuring the parameters of a QoS control scheme.

[Figure 9. Network model for Simulation 2 (64-Kbps link).]

[Figure 10. End-to-end delay (seconds) for voice packets vs. elapsed time in Simulation 2, for the normal and CQ routers.]

REFERENCES

[1] Cisco Systems, "Cisco IOS™ Software Quality of Service (QoS) Solutions," http://www.cisco.com/warp/public/cc/cisco/mkt/ios/tech/tch/qosio_wp.htm.
[2] Geoff Huston, ISP Survival Guide, John Wiley & Sons, Inc., 1999.
[3] Jeff Baher (translated by Yutaka Sakurai), Design of Multimedia Networks, Nikkei BP Corp., 1996 (in Japanese).
[4] Cisco Systems, "Packet Voice Primer," Reference Guides, http://www.cisco.com/warp/public/cc/sol/mkt/ent/gen/packv_in.htm.
[5] OPNET User's Manual, MIL3 Inc., Washington, D.C.