An Open-Source Platform for Distributed Linux Software Routers


An Open-Source Platform for Distributed Linux Software Routers
Raffaele Bolla, DITEN, University of Genoa, Italy; Roberto Bruschi, CNIT, Italy

Abstract. In this paper, our main objective is to explore how Linux Software Routers (SRs) can deploy advanced and flexible paradigms for supporting novel control-plane functionalities and applications. To this end, we investigate and study a new open-source software (SW) framework: the Distributed SW ROuter Project (DROP), which aims to develop and enable a novel cooperative middleware for distributed IP-router control and management. DROP allows logical network nodes to be built through the aggregation of multiple SRs based on the Linux operating system and commodity hardware, which can be devoted to packet forwarding or control operations. In addition to the original ForCES design, DROP aims to extend router distribution and aggregation concepts by moving them to a network-wide scale to enable and support value-added services for next-generation networks.

Index Terms: SW Router, distributed router architecture, open-source software.

INTRODUCTION
The evolution of router and network architectures is one of the most relevant aspects of the Internet as we know it today, since it directly reflects the main open issues and the needs of current and upcoming network technologies and services. In a notable and sound proposal in this area, L. G. Roberts [1] showed that current network routers are too slow, costly, and power hungry. This effect is mainly caused by the architectural inefficiencies of the IPv4 protocol, which requires independent lookup operations to be performed for each incoming datagram. Starting from these considerations, Roberts proposed to evolve next-generation network protocols towards a more scalable forwarding paradigm able to handle traffic at the flow level, rather than at the packet level. Van Jacobson et al. introduced the promising concept of content-centric networking [2]. Arguing that people value the Internet for the content it contains, the authors proposed to replace the location-based routing of IP with a new communications architecture built on named data, where a packet address specifies content, not location. This novel model was specifically designed to retain the simplicity and scalability of IP, while offering much better support for security, delivery efficiency, and disruption tolerance. Both of the above-mentioned contributions gave prominence to two critical needs for Future Internet device design and development: (i) the flexible, autonomous and network-integrated support of value-added heterogeneous services beyond the classical best-effort paradigm; (ii) the management of such heterogeneous operations, to be deployed inside network devices in a scalable way to provide high performance levels. The answer to these needs can probably be summarized in two simple keywords: advanced programmability and workload distribution. Regarding the programmability aspects, router and network device architectures based on general-purpose hardware (HW) and open-source software (SW) have recently regained remarkable attention from the industrial and academic communities. Such renewed interest certainly stems from the intrinsic nature of new-generation Software Router (SR) platforms, which provide complex mixtures of flexible SW-based commodities and efficient HW-based offload functionalities, with a boundary that continuously changes over time.
HW capabilities can expose flexible header processing Application Program Interfaces (APIs), and software functionalities can rely on offload HW. An increasing number of network HW features are becoming programmable, and most commodity CPU designs are dedicating HW blocks to optimize special categories of instructions ranging from multimedia to encryption. At the same time, open-source operating systems, like Linux, offer evergrowing support for such HW enhancements and for complete network protocol stacks. In the common opinion, the primary concern with such architectures regards performance levels, which are thought to be much lower than those provided by specialized network platforms. However, in 2008 Intel [3] announced that general-purpose multi-core processors, when adopted in the data-path operations of networking devices, provide better performance levels than state-of-the-art network processors. Starting from these considerations, we want to take a step toward the evolution of open-source SRs, by moving the focus partially away from their data-plane performance analysis, to demonstrate the feasibility and viability of the SR approach with respect to commercial HW platforms. In this work, our main objective is to investigate how simple Linux SRs can scale beyond classical single-box architectures, providing advanced network capabilities and performance levels comparable to some medium-level commercial routers with multi-chassis architectures. To this end, we explore and study a new open-source SW framework: the Distributed SW ROuter Project (DROP) [4],

2 2 which aims to enable a novel distributed paradigm for control and management of flexible IP-router platforms based on commodity HW and the Linux operating system. In rough words, DROP allows building extensible and flexible multi-chassis routers on top of low-cost server HW platforms and the open-source Linux operating system. This multi-chassis router can be easily applied both as reference platform for advanced research experimentation [5], and as a low-cost alternative to medium-size commercial routers. In more detail, the DROP open-source nature allows researchers to modify any part of device functionalities in a quite easy and quick way. The adoption of commodity HW, which is in continuous and rapid evolution, assures acceptable levels of performance at very low costs. This paper is an extended version of previous contributions in conference proceedings [6] [7] [8], and it tries to give a more organic and complete overview of the DROP framework. Relevant state-of-the-art approaches are discussed and compared with DROP; the internal architecture and the embedded mechanisms are introduced in a deeper detail; working dynamics and performance evaluation results are described more thoroughly.. DROP is partially based on the IETF ForCES architecture [9] [10] [11], and it allows building logical network nodes through the aggregation of multiple devices (i.e., SRs) that host standard Linux SW objects devoted to packet forwarding (e.g., the Linux kernel, or the Click modular router, etc.) or to control operations (e.g., Quagga, Xorp, etc.). As suggested by the IETF ForCES directives, DROP provides an architectural solution that is able to orchestrate such objects. In more detail, it was specifically designed in order to (i) meet the features of Linux network system interfaces and applications, (ii) hide the complexity of distributed network nodes in an autonomic way, and (iii) offer a single control and management interface to system administrators. From a physical point of view, and as shown in Fig. 1, the DROP architecture is composed of three main building blocks: a number of devices (or elements ) running the SW for performing operations at the data- and/or controlplane; a number of interfaces toward external public networks, connecting DROP to other network devices; an internal private network that is used to exchange data-path traffic and inter-element signaling messages. DROP is specifically aimed at aggregating and coordinating all these building blocks in a single logical IP node that is able to directly manage all the public interfaces towards other external devices. The internal private network could be realized by means of different layer 2 and 2.5 technologies, such as Ethernet (currently supported), Infiniband, MPLS and OpenFlow (whose support is currently under progress). The basic idea consists of enabling a selected set of devices at the edge of this internal network cloud to coordinately work like a single logical node with a unique configuration and management interface, without exposing the presence of the internal network and the complexity of the distributed architecture. Figure 1. Overview of distributed router architecture. The current paper focuses on introducing the DROP architecture and working dynamics, validating its mechanisms through standard testing methodologies and demonstrating that it allows increasing SR performance in a scalable way. The paper is organized as follows. The next section describes some related work. 
The basic concepts concerning Linux SR architectures are summarized in Section 3, by focusing on the internal APIs used between control- and data-planes. Section 4 introduces the main design and functional issues in distributing IP router functionalities, and Section 5 provides an overview of DROP and its architecture. Section 6 shows some working dynamics of the proposed architecture and sketches how DROP's building blocks interact among themselves. Section 7 reports the results of tests that were conducted to analyze the performance and the architectural bottlenecks of the prototype. Conclusions are finally drawn in Section 8.

RELATED WORK
The idea of a distributed SR has already been investigated in recent works, such as [12], [13] and [14]. However, these papers tackled the router distribution issue in a different and somewhat complementary way with respect to the DROP objectives. A software router architecture, called RouteBricks, was proposed in [12], [13]. RouteBricks was specifically designed to scale SR data-plane performance through the parallelization of router functionality, both across multiple servers and across multiple cores within a single server. By using four state-of-the-art general

3 3 purpose servers and the Click software [15], the authors demonstrated a 35 Gbps parallel router prototype; this router capacity can then be linearly scaled through the use of additional servers. With a similar purpose, Bianco et al. [14] [16] proposed a multistage architecture, exploiting PC-based routers as switching elements. The multistage architecture was designed to overcome the intrinsic performance limitations of a single-box SR. By combining simple layer-2 load-balancing capabilities at the front stage with an array of layer-3 routers at the back stage, the authors demonstrated that the proposed architecture has excellent scalability characteristics, which could, for example, enable the routing of minimum size packets at line rates in the order of the Gbit per second. In this respect, Sarrar et al. [17] [18] proposed a scheme to increase lookup scalability in data-plane elements. In more detail, they propose a two-stage lookup mechanism: the data-plane elements include a small and fast lookup table cache, filled with the most used and recent routing entries. If the incoming packets hit one of the cached entries, their forwarding happens entirely on the first-stage data-plane element. In case of cache miss, the packets are sent to a second-stage element, named controller, which contains the entire routing table of the device. With respect to the previous contributions, the current work does not directly focus on how to scale the dataplane performance of a multi-box SR, but rather on how to realize its dynamic and autonomic management. In fact, the main objectives of the DROP platform consist of offering support to the following tasks: the run-time composition of the aggregated router, by dynamically managing subscription and disassociation of control and forwarding elements; the flexible adoption of different internal network topologies and organizations: unlike the Routebricks and Bianco et al. proposal, DROP does not fix specific requirements for the distributed data-plane architecture and organization; the dynamic update of routing information and parameters (i.e., routing tables, network interface status, etc.); the management of multiple control elements for resilience purposes; the execution of common signaling and control functionalities (e.g., OSPF and BGP) and the management of the router slow-path. The DROP prototype also includes native support for Netlink communications with the Linux kernel and opensource routing suites, such as Quagga [19] and Xorp [20]. Some developments based on the IETF ForCES architecture have been proposed in the last years (see for example [21] [22] [23], among others). However, to the best of the authors knowledge, such developments do not fit natively on the Linux operating system and commodity HW. The OpenFlow project [24] [25] is a further activity relevant for the current work. Briefly, OpenFlow is a protocol for interfacing the data- and control-plane in a flexible and easily extensible way. Openflow is already supported in a large set of commercial network devices, and it allows the external management of their data-plane capabilities through external control-plane applications. In this respect, OpenFlow can be seen as an interesting alternative to the ForCES protocol. Future versions of the DROP software will include support of OpenFlow. In this respect, the RouteFlow project [26] [27] is relevant to DROP. 
Like our proposed framework RouteFlow aims at providing an abstraction interface for interconnecting the Quagga routing suite with elements specialized in data-plane operations. The main difference between the two frameworks certainly consists of both the typology of forwarding elements, and the protocol used to remotely manage them: RouteFlow is based on OpenFlowenabled switches, DROP on Linux software routers. This base difference certainly affects the internal organization of data and mechanisms, but above all it directly leads to the underlying and thin gap between open-source software-based network devices and software-defined networks. The latter allow using a pre-determined set of base functionality methods exposed by data-plane hardware in a flexible way. In the OpenFlow case these methods follow a flow-driven model. The extension of the set of methods may result complex, but the data-plane elements generally provide high performance, since interface methods are often directly translated into configurations of specialized hardware. On the contrary, software-based devices intrinsically offer the possibility of extending their functionality, generally at the cost of a lower performance level. Moreover, in open-source software-based devices like DROP, the networking functionalities can be easily extended, upgraded or modified by everybody without any additional costs. In addition, software-defined approaches may not be sufficient to effectively cover every advanced network functionality, like, e.g., traffic monitoring and deep packet inspection [28] [29]. Nevertheless, the same functionalities may be natively handled by software-based devices. The resulting situation suggests that the two approaches are complementary and that may be jointly applied to realize highly programmable future networks. Finally, other recent works, such as [30], [31], and [32], deal with the performance optimization of single-box SRs. These contributions showed that new-generation single-box SR platforms can exploit a Linux-based networking SW system and can correctly deploy a multi-cpu/core PC architecture. The achieved results demonstrate that Linux SW routers can attain remarkable levels of data-plane performance, while at the same time preserving portions of the PC s capacity for the application layer. MONOLITHIC LINUX SW ROUTERS Standard architectures for monolithic SRs have to provide a wide and heterogeneous set of functionalities and

capabilities. These range from the functions directly involved in the packet forwarding and switching process to those needed for control (e.g., OSPF and BGP), dynamic configuration and monitoring. With focus on Linux-based architectures, as outlined in [33], [34] and [35], all of the forwarding functions are realized inside the Linux kernel, while most of the control and monitoring operations (e.g., routing and control protocols) are daemons/applications running in user mode. The most well-known examples of network applications/daemons are the Quagga [19] and the XORP [36] routing suites. Similar to their commercial relatives, SRs provide two main kinds of internal traffic paths: the fast and the slow path. The fast path substantially consists of the L2 and L3 forwarding chains, and it is selected for all data packets that only need to be routed or switched and do not require processing at the service/control layer of the local router. In contrast, the slow path is used by all packets that are directed towards local service and control applications (e.g., OSPF Hellos and Link State Updates (LSUs), as well as BGP keep-alive messages, have to be delivered to local IP control applications). Packets following the slow path are generally referred to as exception packets. As pointed out in Fig. 2, the delivery of exception packets to control and service applications is performed through well-known standard interfaces between kernel and user space, namely, network sockets. In addition to packet-related APIs, a router architecture needs a further set of interfaces between data- and control-planes to exchange control data (e.g., for updating the Forwarding Information Base (FIB) or for exchanging information about the status of a network interface). In this regard, Linux includes a highly advanced and complete API, called Netlink [37], which is used as an intra-kernel messaging system, as well as between kernel and user space. Netlink includes all the L2 and L3 interfacing capabilities needed to synchronize the control and the data engines of an entire router. It also allows unicast, multicast and broadcast delivery of control data between data-plane components and applications.

Figure 2. Reference architecture of a Linux-based monolithic SR: control plane and services (e.g., Quagga) in user space, interfaced through packet sockets and the control/status API with the L2/L3 FIBs, forwarding chains and switching matrix in kernel space.

DESIGN CONCEPTS, CONSTRAINTS AND GUIDELINES FOR DISTRIBUTING ROUTER FUNCTIONALITIES
The realization of a distributed router deals with the separation of data- and control-plane functionalities into multiple logical elements that can mutually cooperate to behave like a common single-box device. The control-plane processes and tasks of an IP router cannot be easily distributed among different HW platforms, because today's routing protocols are designed to work with a single aggregation point. For example, the software realizing routing operations (e.g., OSPF and BGP) usually runs as applications/daemons on a single device. Starting from these considerations, the aggregation point has to run such control applications and maintain the entire Forwarding Information Database (FIB) of the distributed device, which includes all of the routing and policy tables and the list of network interfaces.
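The Netlink channel just described is also what the DROP framework relies on to observe kernel state (the FEC-to-kernel interface discussed later). As a purely illustrative, minimal sketch, the following Python fragment (standard library only, Linux-specific) subscribes to a few rtnetlink multicast groups and decodes the fixed message header of the notifications that a routing daemon or a DROP-like controller would react to; the numeric constants are taken from <linux/rtnetlink.h>, and attribute parsing is omitted.

import socket
import struct

# rtnetlink multicast groups (values from <linux/rtnetlink.h>)
RTMGRP_LINK = 0x1
RTMGRP_IPV4_IFADDR = 0x10
RTMGRP_IPV4_ROUTE = 0x40

# rtnetlink message types of interest (values from <linux/rtnetlink.h>)
RTM_TYPES = {16: "RTM_NEWLINK", 17: "RTM_DELLINK",
             20: "RTM_NEWADDR", 21: "RTM_DELADDR",
             24: "RTM_NEWROUTE", 25: "RTM_DELROUTE"}

sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
# pid 0 lets the kernel assign a unique netlink port id; the second field selects the groups
sock.bind((0, RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV4_ROUTE))

while True:
    data = sock.recv(65535)
    offset = 0
    while offset + 16 <= len(data):
        # struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid
        msg_len, msg_type, flags, seq, pid = struct.unpack_from("=IHHII", data, offset)
        if msg_len < 16 or offset + msg_len > len(data):
            break
        print(RTM_TYPES.get(msg_type, f"type {msg_type}"), "len", msg_len)
        offset += (msg_len + 3) & ~3  # netlink messages are 4-byte aligned

A full implementation would also decode the per-message attributes (interface index, prefix, next hop, and so on), which is exactly the information a control element needs to keep its FIB synchronized.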
As outlined in [9], the availability of multiple elements that are capable of performing control-plane functionalities can be easily exploited only for resilience purposes. For example, if multiple elements acting at the control- plane are present, only one can work actively, while the others can only be used as backup copies for fault recovery. The active control element has to manage multiple data-planes in a dynamic and coordinated way. For example, it has to maintain and update the aggregated FIB (i.e., the FIB of the logical node resulting from the aggregation of all the elements). This aggregated FIB corresponds to the one that an equivalent single-box router would have. Its filling obviously depends on the information and data coming from both the forwarding elements and the network control/signaling processes (e.g., Quagga). Thus, the information in the aggregated FIB has to be suitably disaggregated in a number of local routing databases, one for each element performing forwarding operations. In addition, a copy of the disaggregated FIB has to be installed at each forwarding element and synchronized with the central database. All of these design requirements lead to the

5 5 need for a distributed software that is able to bi-directionally transfer a router s control data 1 and to aggregate/disaggregate the data among multiple data-plane elements, including the active control-plane. In addition to FIB management, network control and signaling applications (e.g., routing daemons like Quagga), which run on control elements, need to receive and transmit protocol signaling packets from/to the network interfaces of data-plane elements. Common IP routing protocols require different typologies of signaling packets, which range from IP datagrams with unicast and/or multicast addresses (e.g., OSPF) to TCP connections toward the IP addresses of external network interfaces (e.g., BGP). Therefore, a distributed router architecture must also provide specific solutions for enabling data-plane elements to forward such signaling packets to the control element. In other words, a specific solution is needed for the management of the router s slow path. THE DROP ARCHITECTURE The proposed architecture consists of a cooperative and distributed software framework that controls and coordinates the data-plane and the control-plane functionalities among different SRs in a dynamic and autonomous way. SRs that specialize in data-plane functionalities are referred to as Forwarding Elements (FEs), and those acting at the control-plane are known as Control Element (CEs). In more detail, DROP aims to: 1) aggregate and coordinate the fast path of a set of Linux boxes; 2) use a single Linux box to act as a reference control element, called master CE (mce), and run control applications (e.g., Quagga and Xorp) and user-driven configuration and monitoring tools (e.g., tc and ip) 2 ; 3) realize a flexible paradigm for supporting the router s internal slow path; 4) provide multiple backup elements for the control functionalities, called backup CEs (bces), which may quickly become active upon the failure of the master control element. It is worth noting that the large part of base IP router functionalities provided by DROP can be realized also using Openflow-based data-plane hardware in place of Linux SRs (future DROP versions will include such capabilities). As discussed in section 0, the value-added contribution given by a SR solution consists in extending and supporting advanced functionalities (e.g., deep packet inspection) at the data plane in a more flexible and effective way. The DROP framework consists of two main applications: the Control Element Controllers and Forwarding Element Controllers (CEC and FEC, respectively), which run on elements that perform the control-plane and dataplane functionalities 3 and jointly cooperate to make these multiple entities behave as a single network element. CEC and FEC applications interact among themselves to realize the distributed management of the slow and fast paths in a transparent and autonomic way with respect to other network processes and applications. CECs are devoted to managing and exchanging information regarding the whole aggregated router with the control applications. These applications include Quagga and/or other Operation, Administration and Management (OAM) tools like command line interfaces. FECs are responsible for managing the forwarding configuration of each FE. Fig. 3 shows an overall overview of the proposed framework and outlines the different SW modules included in CEC and FEC applications and the interfaces between them, including those to control applications and to the Linux kernels of the FEs. 
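The following sub-section states that CEC and FEC exchange control data over the internal network with a simple protocol inspired by ForCES, without giving its wire format. Purely as an illustrative sketch, with entirely hypothetical message types and field layout, such an exchange could be framed over the internal TCP connections as follows:

import struct

# Hypothetical fixed header for CEC<->FEC control messages:
# u16 version, u16 msg_type, u32 element_id, u32 transaction_id, u32 payload_len
HDR_FMT = "!HHIII"
HDR_LEN = struct.calcsize(HDR_FMT)

MSG_SET = 1   # e.g., push a disaggregated FIB change to a FE
MSG_GET = 2   # e.g., read FE parameters
MSG_ACK = 3   # positive acknowledgement
MSG_NACK = 4  # negative acknowledgement / abort

def pack_message(msg_type, element_id, tid, payload: bytes) -> bytes:
    """Frame one control message for the internal TCP channel."""
    return struct.pack(HDR_FMT, 1, msg_type, element_id, tid, len(payload)) + payload

def unpack_message(sock):
    """Read one framed message from a connected socket (blocking)."""
    hdr = _recv_exact(sock, HDR_LEN)
    version, msg_type, element_id, tid, plen = struct.unpack(HDR_FMT, hdr)
    return msg_type, element_id, tid, _recv_exact(sock, plen)

def _recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the control channel")
        buf += chunk
    return buf

A length-prefixed binary header of this kind keeps the channel easy to parse on both ends; the real DROP channel additionally carries Netlink-formatted payloads and passes through the authentication threads described in the following sub-sections.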
The aggregation/disaggregation of such data is centrally guided by the active CEC, which includes most of the DROP logic, mechanisms and algorithms. FECs are devoted to the application of commands and configurations from the CECs and to forwarding all routing exception data to the CECs. The rest of this section describes in detail the main architecture and functionalities of the DROP framework. The first sub-section introduces the internal interfaces needed to exchange control data, and the two following sub-sections describe the FEC and the CEC applications. A further sub-section discusses how multiple backup CECs can be maintained for fast recovery purposes. Finally, the last sub-section focuses on the Tx/Rx Slow Path Exception Packet Manager (xppm) module, which is part of the FEC and is the key element for the management of slow-path processes in the DROP framework.

The Distributed Router Internal Interfaces
As shown in Fig. 3, the DROP framework uses different interfaces to bi-directionally exchange control data and exception packets among CEC and FEC applications, the data-planes of FEs, and control-plane applications. In detail, there are three communication stages: (i) the communication between the FE kernel and the FEC application, (ii) the communication between the FEC and the CEC, and finally, (iii) the communication between the control processes and the CEC. As far as the communications towards the control-plane and the kernels are concerned, we decided to use standard Linux intercommunication interfaces to guarantee maximum compatibility with the Linux system and its applications. Exception traffic is exchanged through standard network sockets. Control data are exchanged through the Linux-native Netlink protocol [37] (see footnote 4). These control data include data regarding the configuration of the router, such as, for example, a routing table or the parameters of a network interface. The control data communication between the CEC and the FEC is realized through a very simple protocol inspired by the ForCES one [38]. The forwarding of exception packets is performed with the packet encapsulation mechanism introduced later in this section (see the xppm sub-section), and the control data is carried by using the same templates and contents of Netlink messages.

1 The same kind of information is carried by Netlink in a standard Linux-based architecture.
2 Control applications may indeed run on third-party elements, but a master control element is required to maintain the database of the distributed router elements and to synthesize the Forwarding Information Database (FIB).
3 If an SR performs both functionalities, separate instances of a FEC and a CEC are required for that element.
4 The communication between the FEC and the kernel is realized over a standard Netlink socket, and the one between the CEC and the control applications by using Netlink packets over a loopback UDP socket.

Figure 3. Software architecture of the DROP framework: CEC and FEC applications and their intercommunication interfaces.

Forwarding Element Controller
As previously mentioned, the FEC is a control application that acts mainly as a bi-directional SW bridge between the CEC and the forwarding element's data-plane. As shown in Fig. 3, the xppm module is part of the FEC application, but because it performs a different kind of task with respect to the other FEC threads, it is introduced in a separate sub-section. In detail, the main objectives of the FEC consist of the direct management of the SR functionalities and control data of the FE and of the synchronization of the local network information with the CEC. For this purpose, each FEC uses two communication interfaces: the FEC-to-kernel and FEC-to-CEC interfaces. The former allows the FEC to write/modify network parameters in the SR kernel. Such parameters include configurations of network interfaces and routes. The FEC-to-kernel interface also reads notification events from the kernel (e.g., link events and failures). The latter is used for two-way communications between the FEC and the CEC. The FEC is composed of multiple SW threads, each one with a specific role. The core thread, called the FE Manager, maintains a copy of the FE disaggregated FIB and provides the mechanisms for performing three important functions: (i) communication management with the master CEC, (ii) retrieval of control data and events from the kernel, and (iii) processing of commands/notifications from the CEC or from the kernel. As far as the last functionality is concerned, the FE Manager can query or write configuration data through the Netlink client thread, which simply translates the requested operations into Netlink syntax and sends the resulting messages to the SR kernel. Kernel answers and notifications are received by the Netlink server thread. Upon receiving a message, the Netlink server parses it and forwards it to the FE Manager and, if needed, notifies the CEC. The FEC-to-CEC connectivity is locally provided by four threads that realize two separate layers of communication. The former provides an authentication mechanism for the message exchange (Transmission and Reception Authentication, Tx and Rx Aut) to create a secure channel between the CEC and the FEC. The latter, TCP Tx and Rx, implements the basic operations for TCP connection management. Another FEC thread is the Bootstrap Client, which implements the initial CEC discovery mechanism and all the routines needed for connecting with the CEC.

Control Element Controller
The principal aim of the CEC is threefold: (i) maintain the connectivity to FEs and backup CEs; (ii) provide an

7 7 interface layer to control-plane applications and expose the aggregated FIB to them; (iii) elaborate all of the data coming from the control-plane applications and the FEs to manage the aggregated FIB and the disaggregated copies for FEs. Similarly to the FEC, the CEC application consists of a set of SW threads, as shown in Fig. 3. Here, the core thread is the CE Manager, which includes all of the mechanisms and algorithms needed to dynamically coordinate the FEs and control-plane applications. In fact, all the other CEC threads are solely devoted to maintaining or initializing communications to other FECs, CECs or local control applications/services. Specifically, the bidirectional communication with control and service applications (e.g., Quagga) is provided by two threads, namely, the Netlink client and server, which are devoted to sending and receiving control data, respectively. The connectivity toward other FECs and CECs is managed through a stack of threads: Tx and Rx Aut, which are devoted to authentication operation of outgoing and incoming messages; Two Tx and Rx TCPs for each connected FE and CE, which are then used to the manage transmission and reception operations of TCP sockets. In addition to the previously cited threads, a Bootstrap Server is used to listen to connection requests originated by FECs in the initialization phase. When it receives a valid join request, the CEC sends the FEC the proper configuration parameters such as the IP address and TCP connection port. Coming back to the CE Manager, it realizes a complex state-machine that manages any FIB and configuration modifications. For example, it updates the routing table and adds a network interface or a new FE in a secure and reliable way. This state machine can be triggered by messages and notifications coming from the local controlplane applications by means of the Netlink Server thread, or from other elements by means of the TCP Rx and Rx Aut threads. Messages from other elements or Netlink messages from control-plane applications contain one or more requested operations. Operations can be classified into two main typologies: set and get. A get operation is a simple request for reading one or more parameters of the FIB; consequently, the CE manager replies to the sender with a message containing the required information. A set operation is much more critical, since it requires modifications to the FIB and, potentially, to the configurations of all FEs, so that a reliable mechanism for handling unexpected states is clearly needed. Therefore, we started our design by considering the Netlink features. In fact, this protocol already provides a robust set of atomic commands that are used by applications to request changes to the local kernel, and notify the results back. Linux applications performing complex operations usually include the necessary logic for translating them into atomic Netlink commands, and for managing their (intermediate) results. However, in the DROP context, each atomic Netlink command or event (e.g., disconnection of a link) ends up in a number of updates for synchronizing remote FE FIBs. The return status of a command can be positive only upon the acknowledgment of all the involved FEs. Otherwise, the command needs to be aborted and the pending changes removed by FIB copies on CEs and FEs. Specifically, as shown in Fig. 
4, when the manager receives a set operation, it usually has to:
(i) process and update its internal databases according to the received message, and mark the modification as "pending";
(ii) forward the requested modification to the local control-plane and/or to the elements (as needed);
(iii) wait for an acknowledgement that the requested modification has been applied/received by control applications or elements;
(iv) when all the expected acknowledgements arrive, send a confirmation to the FEs and pass the status of the modifications in the CE database from "pending" to "confirmed";
(v) send a notification of the FIB change to control-plane applications.
(A minimal code sketch of this pending/confirm sequence is given below.) It is worth noting that the CE Manager has to communicate with elements and with local control applications not only by means of two different protocols (Netlink and ForCES) but also with different data structures. In fact, communications towards the control-plane are performed with the aggregated FIB and those towards the other elements with the disaggregated version. The aggregated FIB contains only a part of the information maintained by the disaggregated FIBs, because the latter include additional parameters of the internal configuration of the distributed router. Tables I, II and III are an example of the three tables that compose the entire disaggregated FIB maintained by the CE Manager, namely, the list of connected elements, the table of network interfaces, and the routing table.
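As announced above, the following is a minimal sketch of the pending/confirm sequence implemented by the CE Manager (cf. Fig. 4 below). It assumes FE proxy objects exposing hypothetical send_change/confirm/remove_pending methods and a FIB object with the corresponding bookkeeping calls; it only illustrates the two-phase pattern, not DROP's actual implementation.

def apply_set_operation(fib, change, forwarding_elements, timeout_s=2.0):
    """Two-phase application of a FIB change, in the spirit of the CE Manager state machine."""
    fib.mark_pending(change)                       # 1) record the change as "pending"
    involved = [fe for fe in forwarding_elements if fe.is_involved(change)]
    acks = []
    for fe in involved:                            # 2) push the disaggregated change to each FE
        acks.append(fe.send_change(change, timeout=timeout_s))
    if all(acks):                                  # 3a) every FE acknowledged: confirm everywhere
        for fe in involved:
            fe.confirm(change)
        fib.confirm(change)
        return True                                # caller notifies control-plane applications
    for fe in involved:                            # 3b) timeout or NACK: roll back everywhere
        fe.remove_pending(change)
    fib.remove_pending(change)
    return False                                   # caller generates a Nack for the requester

On the failure branch the caller generates a Nack towards the requesting control-plane application, exactly as in the lower path of the state machine of Fig. 4.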

Figure 4. Base state machine implemented by the CE Manager for set operations: upon a new configuration change request (e.g., triggered by control-plane applications, or by link or element fault events on a FE), the new aggregated and disaggregated FIBs are recalculated, FIB change requests are sent to the involved FEs, and the pending state is entered; if the notifications from all the FEs are received and are all successful, the modifications are confirmed to all FEs and to the local aggregated and disaggregated FIBs, otherwise (negative notification or timer expiry) the pending modifications are removed from the FEs and from the local FIBs and a Nack is generated for the control-plane application; in both cases a notification is finally sent to all the control-plane applications.

In detail, Table I shows that the CE Manager maintains a specific entry for each connected element, which includes a univocal identifier, the type of the element (CE or FE), its list of private IP addresses (i.e., the addresses of the interfaces that compose the private network), and a list of available routing capabilities. As shown in the example data in Table I, DROP allows elements to have multiple internal interfaces and to build any internal topology to interconnect elements. The data in Table I describe a star topology (a /24 network to which all the elements are connected) with the addition of two full-duplex links, between FE 0 and FE 1 and between FE 1 and FE 3. The traffic routing inside such a topology is managed by the CE Manager during the FIB aggregation and disaggregation operations. Table II maintains all of the data related to the public interfaces of the distributed router. For each network interface, this table includes all the parameters that are usually accessible through the ifconfig command in a standard Linux box (e.g., L2 protocol, link speed, MAC address, IP address, IP netmask, status of the link, and Tx and Rx traffic counters). Moreover, because the interfaces of different FEs may have overlapping names and indexes (as shown in Table II), this table also has to provide a suitable and univocal remapping of such parameters. Table III reports the disaggregated routing table, which corresponds to the sum of the local routing tables of all the FEs. The corresponding aggregated version is shown in Table IV, and it is substantially the classical routing table of an equivalent single-box router. The content of this last table is computed by the routing daemons (i.e., Quagga) and sent to the CEC through the Netlink interface.

TABLE I: LIST OF CONNECTED ELEMENTS MAINTAINED BY THE CE MANAGER.
| Element ID | Type | Private addresses | Capabilities |
| 0 | FE | /24, /24 | IP forwarding, IP QoS, Ethernet forwarding, etc. |
| 1 | FE | /24, /24, /24 | IP forwarding, IP QoS, Ethernet forwarding, etc. |
| 2 | CE | /24 | IP control-plane, backup |
| 3 | FE | /24, /24 | IP forwarding, IP QoS, Ethernet forwarding, etc. |

TABLE II: TABLE OF THE PUBLIC NETWORK INTERFACES MAINTAINED BY THE DISAGGREGATED FIB OF THE CE MANAGER.
| Aggregated interface name | Local interface name | FE Id | Aggregated interface index | Local interface index | IP address | Type | Speed |
| eth0:0 | eth | | | | /24 | Eth | 100 |
| eth0:1 | eth | | | | /24 | Eth | 1000 |
| eth1:1 | eth | | | | /24 | Eth | 1000 |

TABLE III: DISAGGREGATED FIB MAINTAINED BY THE CE MANAGER.
| # | FE# | Network | Next hop | Interface | Origin |
| | | / | | eth0:0 | Static |
| | | / | | | Static |
| | | / | | | Static |
| | | / | | | Quagga |
| | | / | | eth0:1 | Quagga |
| | | / | | | Quagga |
| | | / | | | Static |
| | | / | | eth1:1 | Static |
| | | / | | | Static |

As the CE Manager receives an aggregated routing entry, it derives an entry for each FE. As shown in Table III, the FE that owns the egress interface has the same entry as the aggregated routing table, while the routing lines of the other FEs point to an internal delivery. The delivery is calculated by the CE Manager on the basis of the internal topology, so as to guarantee shortest paths and load balancing. Once derived, these entries are sent to the FEs through the mechanisms described earlier in this section. The aggregated FIB (i.e., the control data that is exchanged with the applications working at the control plane) is composed only of the routing table in Table IV and the network interface list in Table V, which includes only a subset of the parameters in Table II. In fact, in the aggregated version of the network interface list, all parameters that refer to the internal organization of the distributed router (e.g., the ID of the FE owning the interface) are omitted.

TABLE IV: AGGREGATED FIB MAINTAINED BY THE CE MANAGER.
| # | Network | Next hop | Interface | Origin |
| | / | | eth0:0 | Static |
| | / | | eth0:1 | Quagga |
| | / | | eth1:1 | Static |

TABLE V: TABLE OF THE PUBLIC NETWORK INTERFACES MAINTAINED BY THE AGGREGATED FIB OF THE CE MANAGER.
| Aggregated interface name | Aggregated interface index | IP address | Type | Speed |
| eth0:0 | | /24 | Eth | 100 |
| eth0:1 | | /24 | Eth | 1000 |
| eth1:1 | | /24 | Eth | 1000 |

Backup CE
The distributed router has a centralized architecture, because it supports only one active CE at a time. The presence of a single CE is obviously critical, because it is a single point of failure. To avoid this drawback, DROP supports the presence of multiple backup CEs, which, in case of failure of the master, can replace it in an automatic and transparent manner without modifying any internal configuration. In usual operating conditions, all of the backup CEs (bces) maintain an active TCP connection to the mce, which is used for synchronizing FIB data and for exchanging heart-beating messages. In detail, every time the master CE updates its databases, it forwards the updated data to the bces with ForCES messages. In this way, every bce has a full and updated copy of all the master CE's databases. In contrast, heartbeat messages are used to monitor potential failures of the mce. Every CE has a univocal identifier number that is set during the initial bootstrap operations. The identifier is used to manage CE priorities in case of failure. The mce takes on the maximum value for this identifier, while the best candidate bce to replace the mce takes on the second maximum value. However, upon mce failure, a re-election phase among CEs is performed to ensure the presence of other bces as well. It is worth noting that, upon mce failure, and while a new master CE is being elected, all the FEs maintain their configuration for a certain time (as specified in the RFC 3623 non-stop forwarding mechanism [39]). If the new mce is elected before this time (the usual case), the forwarding process suffers no interruption. If no new mce is elected before such a time period expires, the FEs reset their configurations.

Tx/Rx Slow Path Exception Packet Manager (xppm)
As sketched in the section on design concepts and guidelines, one of the major issues in distributing routing functionalities is related to the management of the slow path.
Most of the protocol signaling is carried by heterogeneous packets, which are often destined to the IP addresses of router interfaces or to IP multicast addresses (e.g., OSPF). Moreover, the signaling data are encapsulated with highly heterogeneous protocol stacks. For instance, OSPF uses signaling packets directly encapsulated in IP datagrams. Specifically, on broadcast networks, the OSPF protocol uses two multicast addresses (224.0.0.5 to send Hello packets and 224.0.0.6 to send information data). BGP transfers signaling packets through TCP connections on port 179. When internal BGP (i-BGP) is adopted, the TCP connection usually ends at the IP loopback address; when external BGP (e-BGP) is adopted, TCP connections end at the IP addresses of the external interfaces. In a monolithic router, both e-BGP and i-BGP connections are directly tied to the BGP routing software. In our case, if the loopback address is bound to the CE, only i-BGP connections may be directly received. In contrast, e-BGP connections have to be proxied by the FE towards the routing daemons at the CE. Starting from all these considerations, we decided to introduce a specific process in the FEC architecture: the xppm, which is devoted to the following functions: (i) intercepting signaling packets to be sent to the CE, (ii) forwarding such packets to control-plane applications that run on the CE, also including some useful local data (e.g., the identifier of the network interface where the packet was received), and (iii) intercepting possible replies from control-plane applications and forwarding them towards the correct external interface. As shown in Fig. 5, the xppm includes two main building blocks: the Connection-Oriented xppm (CO-xPPM) and the Connection-Less xppm (CL-xPPM), which manage exception traffic carried on TCP connections and connection-less traffic, respectively. In detail, the CL-xPPM is specifically designed for managing every exception packet that is not carried through TCP connections but with other heterogeneous encapsulations, such as UDP, IP, and Ethernet. Signaling traffic carried by TCP is managed in a separate way only because standard TCP sockets need quite different software dynamics with respect to connection-less sockets (e.g., raw sockets).

CL-xPPM
The CL-xPPM is composed of a set of thread pairs. Each pair is specifically devoted to managing the Tx and Rx operations of a certain type of signaling packets (e.g., UDP packets on a certain Rx port, or OSPF packets). The thread pairs are allocated by the FEC manager upon an explicit request from the CEC, which must provide all of the required information on the packet template to manage. Depending on the specific packet template, a different type of socket (e.g., L2 and L3 raw sockets and UDP sockets) is used to receive and transmit packets to/from the external and the internal networks. The pair is composed of a first thread that receives packets from the external networks and retransmits them to the CE, and a second thread that manages the signaling traffic in the opposite direction. The packet delivery from the CL-xPPM to the CE is performed by encapsulating the data and all of the needed headers over new IP headers. The new IP headers have the CE address as the destination IP, and further information related to the Rx network interface is added. In a similar way, signaling packets from the CE are encapsulated over IP headers and directed to the FE; these packets also include information on the Tx link interface to be used. It is worth noting that the simple delivery of signaling packets between the xppm and the CE is not sufficient, since some control and service applications may need to exchange additional information/commands related to the external interface. For example, in applications using raw sockets (e.g., OSPFd in Quagga), it is often necessary to preserve the identifier of the external interface that originally received the packet.
Thus, the xppm was designed to enclose all such parameters in the traffic forwarded to the CE. Moreover, because well-known signaling protocols such as OSPF dynamically use different L2/L3 multicast addresses, the CE applications can directly request through a simple IP packet to the CL-xPPM pair to enable the reception of a certain multicast address. CO-xPPM The CO-xPPM is devoted to managing signaling traffic carried on TCP connections. It is composed of a master thread that acts as a controller and a variable number of thread pairs, one for each managed TCP connection. The aim of each thread pair is to proxy a TCP connection, coming from an external device and ending at the FE towards a new connection from the FE to the CE. The first thread on each pair manages the I/O operations of the TCP sockets towards the external network, and the second thread manages the same operations for the internal TCP socket. The two threads have two shared buffers that are used to exchange data from the external to the internal connection and vice versa. When a TCP connection is closed by the CE or by an external device, the thread receiving the connection closure signals it to its twin, and both connections will be closed. The controller thread has two main objectives. 1) to periodically check the status and the correct work of each thread pair; 2) to automatically allocate a new thread pair if a new TCP connection is initialized from the external or from the internal network. Obviously, the FEC manager has to notify a connection template (i.e., the values of the Tx port of the TCP) that the controller thread must manage.
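To make the xppm delivery model concrete, here is a minimal, illustrative Python sketch of the receive direction of a CL-xPPM-like thread: each exception packet read from an external raw socket is prefixed with a small header carrying the aggregated index of the receiving interface and relayed to the CE over the internal network. The header layout, names, and the use of a plain TCP relay are assumptions of this sketch; the actual CL-xPPM encapsulates packets in new IP headers addressed to the CE, as described above.

import socket
import struct

# Hypothetical relay header: u32 aggregated interface index, u16 payload length.
RELAY_HDR = "!IH"

def rx_loop(raw_sock, ce_addr, aggregated_ifindex):
    """External interface -> CE: first thread of a CL-xPPM-like pair (illustrative only)."""
    with socket.create_connection(ce_addr) as ce_sock:
        while True:
            pkt = raw_sock.recv(65535)      # e.g., an OSPF packet read from an L3 raw socket
            hdr = struct.pack(RELAY_HDR, aggregated_ifindex, len(pkt))
            ce_sock.sendall(hdr + pkt)      # the CE de-encapsulates it and hands it to the daemon

The twin thread performs the symmetric operation, reading framed packets from the CE and transmitting them on the external interface indicated in the accompanying metadata.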

11 11 Figure 5. xppm architecture and main building blocks. DISTRIBUTED ROUTER DYNAMICS This section introduces some examples of DROP working dynamics. In detail, sub-section 0 shows the main operations and SW dynamics when a routing application (i.e., Quagga) updates the routing table. Sub-section 0 introduces how a link fault on a FE is managed by the distributed router. Finally, sub-section 0 shows the main operations performed by the master CE and FEs for the Tx and Rx of exception packets. Updating the routing tables When a control application updates one or more entries of the routing table, it interacts with the CEC through the Netlink interface, as it does in standard Linux boxes. Routing table updates are intercepted by the Netlink Server, which subsequently reads the message payload and decodes it. The information is forwarded to the CE Manager, which updates the aggregated FIB with the new data, marking it as pending. The CE manager produces the new disaggregated FIB and sends the new data to the FEs by following a twostep mechanism. First, it communicates the new route entry through the ForCES interface to the FE that owns the route egress interface. For example, with reference to entry #1 of Table IV, the CE Manager sends the update message to FE #0 because it owns the network interface eth0:0. The disaggregated route sent to FE #0 is represented in the first line of Table III. On the other hand, the FEC receives the update message through Authentication Rx, which checks the network with TCP Rx. The FE Manager then processes the request and adds a routing path. This action is realized by writing a Netlink message, which is dispatched towards a Linux kernel across the Netlink Client thread. When the FE that owns the egress interface acknowledges the successful update of its local FIB, the CE manager signals to all the other FEs that the new route is available through distributed router internal delivery. To this end, the CE manager sends a different local route update to each FE, which contains the disaggregated FE-specific versions of the routing entry. However, with reference to the entry #1 of Table III, the second line is the local entry to be sent to FE #1 and the third line for the local entry to be sent to FE #3. Upon the successful completion, the FEs sends acknowledgement messages to the CEC. When all the acknowledgements are received, the CE manager changes the routing modifications from the pending state to the stable one, and forwards FIB synchronization messages to bces. Finally, the CE manager sends a further acknowledgement message to the control-plane applications through the Netlink interfaces. Similar procedures are also executed for deleting entries in the routing table. Link Fault A link fault event can be caused by the failure of link media, of the local network interface, or of the neighboring node. The management of such an event involves both the FEs and the mce, because it must be centrally managed by the control application of the CE, although it is an advertisement that occurs on a FE. Specifically, such messages are received from the kernel by the Netlink Server threads of the FEC. The FE Manager creates a notification message for the mce to be sent through the ForCES interface. When the message is received and parsed by the

CEC, the CE Manager: (i) updates the status of the network interface in the FIB; (ii) invalidates all the routes towards such an interface, and sends update messages to the FEs according to the mechanism shown in the previous sub-section; (iii) sends a notification to control-plane applications through the Netlink interfaces. Thanks to this last notification, routing protocols can immediately advertise the topological change due to the link fault, propagate it to the neighboring nodes, and populate the new routing table.

I/O Operations for Exception Packets
The complete realization of the slow path obviously deals with both the reception and the transmission of exception packets. As already introduced in the description of the DROP architecture, the key element is the xppm module, which maintains a number of network sockets listening for exception traffic on both the internal and the external interfaces. The type and the number of sockets are signaled by the mce on the basis of the active control applications. The reception phase usually works according to the steps indicated in Fig. 6, which can be summarized as follows:
1) the exception packet is received by the FE kernel on a public network interface and delivered to the xppm through the network socket;
2) when the xppm receives the exception packet, it encapsulates the packet and some additional parameters in a new packet (the additional parameters include the aggregated index, i.e., the univocal index across the whole distributed router, see Table II, of the network interface that received the packet);
3) the new encapsulated packet is sent to the mce through a TCP connection;
4) when the mce receives the packet, it de-encapsulates the original exception packet and forwards it, along with the aggregated interface index, to the routing daemon through the CE-internal loopback network interface.
The transmission phase, which is shown in Fig. 7, consists of the same operations as the reception phase, but applied in the opposite order.

Figure 6. Reception operations of exception packets.
Figure 7. Transmission operations of exception packets.

PERFORMANCE AND ARCHITECTURAL EVALUATION
Our primary objective in this section is to evaluate the performance levels and the main advantages that the DROP platform can offer. To this end, we decided to evaluate the DROP architecture through four main analysis and validation aspects, regarding the efficiency of its internal operations, the scalability of the overall distributed router architecture, and the data- and control-plane performance. Notwithstanding DROP's support of different topologies and layer 2 protocols in the internal private network, we decided to perform all of the tests with a star-switched Gigabit Ethernet LAN. This simple architecture allows us to obtain results that permit the evaluation of DROP performance in a clearer and more intuitive way than would be possible with a more complex network architecture. The distributed router is composed of a variable number of elements. Each FE and CE is a Linux SR with a HW platform based on two 3.0 GHz Intel Xeon Quad-Core processors and equipped with up to 8 Gigabit Ethernet network interfaces. Concerning testing and benchmarking tools, we used the Ixia N2X Router Tester, which allows us to generate and measure traffic flows accurately and to emulate routing protocols. All the tests have been carried out as many times as necessary to achieve a confidence interval of 3% and a confidence level of 95% on the measured results.
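The exact statistical procedure behind the stopping rule quoted above is not detailed in the text; read as "repeat each test until the 95% confidence interval half-width is within 3% of the measured mean", it can be sketched with the Python standard library and a normal approximation as follows (function and variable names are illustrative):

import statistics

def needs_more_runs(samples, rel_width=0.03, z=1.96):
    """Return True until the 95% CI half-width falls below rel_width of the mean."""
    if len(samples) < 2:
        return True
    mean = statistics.fmean(samples)
    sem = statistics.stdev(samples) / len(samples) ** 0.5   # standard error of the mean
    return (z * sem) > rel_width * abs(mean)

# Example usage: keep measuring until the throughput estimate is tight enough.
# throughputs = []
# while needs_more_runs(throughputs):
#     throughputs.append(run_one_trial())    # run_one_trial() is a placeholder for a test run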


More information

IP - The Internet Protocol. Based on the slides of Dr. Jorg Liebeherr, University of Virginia

IP - The Internet Protocol. Based on the slides of Dr. Jorg Liebeherr, University of Virginia IP - The Internet Protocol Based on the slides of Dr. Jorg Liebeherr, University of Virginia Orientation IP (Internet Protocol) is a Network Layer Protocol. IP: The waist of the hourglass IP is the waist

More information

Networking: Network layer

Networking: Network layer control Networking: Network layer Comp Sci 3600 Security Outline control 1 2 control 3 4 5 Network layer control Outline control 1 2 control 3 4 5 Network layer purpose: control Role of the network layer

More information

Overview. Information About Layer 3 Unicast Routing. Send document comments to CHAPTER

Overview. Information About Layer 3 Unicast Routing. Send document comments to CHAPTER CHAPTER 1 This chapter introduces the basic concepts for Layer 3 unicast routing protocols in Cisco NX-OS. This chapter includes the following sections: Information About Layer 3 Unicast Routing, page

More information

IP Routing Volume Organization

IP Routing Volume Organization IP Routing Volume Organization Manual Version 20091105-C-1.03 Product Version Release 6300 series Organization The IP Routing Volume is organized as follows: Features IP Routing Overview Static Routing

More information

Planning for Information Network

Planning for Information Network Planning for Information Network Lecture 7: Introduction to IPv6 Assistant Teacher Samraa Adnan Al-Asadi 1 IPv6 Features The ability to scale networks for future demands requires a limitless supply of

More information

Table of Contents 1 MSDP Configuration 1-1

Table of Contents 1 MSDP Configuration 1-1 Table of Contents 1 MSDP Configuration 1-1 MSDP Overview 1-1 Introduction to MSDP 1-1 How MSDP Works 1-2 Multi-Instance MSDP 1-7 Protocols and Standards 1-7 MSDP Configuration Task List 1-7 Configuring

More information

Contents. Configuring EVI 1

Contents. Configuring EVI 1 Contents Configuring EVI 1 Overview 1 Layer 2 connectivity extension issues 1 Network topologies 2 Terminology 3 Working mechanism 4 Placement of Layer 3 gateways 6 ARP flood suppression 7 Selective flood

More information

Contents. Configuring MSDP 1

Contents. Configuring MSDP 1 Contents Configuring MSDP 1 Overview 1 How MSDP works 1 MSDP support for VPNs 6 Protocols and standards 6 MSDP configuration task list 7 Configuring basic MSDP features 7 Configuration prerequisites 7

More information

MPLS MULTI PROTOCOL LABEL SWITCHING OVERVIEW OF MPLS, A TECHNOLOGY THAT COMBINES LAYER 3 ROUTING WITH LAYER 2 SWITCHING FOR OPTIMIZED NETWORK USAGE

MPLS MULTI PROTOCOL LABEL SWITCHING OVERVIEW OF MPLS, A TECHNOLOGY THAT COMBINES LAYER 3 ROUTING WITH LAYER 2 SWITCHING FOR OPTIMIZED NETWORK USAGE MPLS Multiprotocol MPLS Label Switching MULTI PROTOCOL LABEL SWITCHING OVERVIEW OF MPLS, A TECHNOLOGY THAT COMBINES LAYER 3 ROUTING WITH LAYER 2 SWITCHING FOR OPTIMIZED NETWORK USAGE Peter R. Egli 1/21

More information

Internet Routing Protocols Part II

Internet Routing Protocols Part II Indian Institute of Technology Kharagpur Internet Routing Protocols Part II Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. I.I.T. Kharagpur, INDIA Lecture 8: Internet routing protocols Part

More information

CSC 401 Data and Computer Communications Networks

CSC 401 Data and Computer Communications Networks CSC 401 Data and Computer Communications Networks Link Layer, Switches, VLANS, MPLS, Data Centers Sec 6.4 to 6.7 Prof. Lina Battestilli Fall 2017 Chapter 6 Outline Link layer and LANs: 6.1 introduction,

More information

Chapter 5.6 Network and Multiplayer

Chapter 5.6 Network and Multiplayer Chapter 5.6 Network and Multiplayer Multiplayer Modes: Event Timing Turn-Based Easy to implement Any connection type Real-Time Difficult to implement Latency sensitive 2 Multiplayer Modes: Shared I/O Input

More information

MPLS VPN--Inter-AS Option AB

MPLS VPN--Inter-AS Option AB The feature combines the best functionality of an Inter-AS Option (10) A and Inter-AS Option (10) B network to allow a Multiprotocol Label Switching (MPLS) Virtual Private Network (VPN) service provider

More information

HP Routing Switch Series

HP Routing Switch Series HP 12500 Routing Switch Series MPLS Configuration Guide Part number: 5998-3414 Software version: 12500-CMW710-R7128 Document version: 6W710-20121130 Legal and notice information Copyright 2012 Hewlett-Packard

More information

Experimental Extensions to RSVP Remote Client and One-Pass Signalling

Experimental Extensions to RSVP Remote Client and One-Pass Signalling 1 Experimental Extensions to RSVP Remote Client and One-Pass Signalling Industrial Process and System Communications, Darmstadt University of Technology Merckstr. 25 D-64283 Darmstadt Germany Martin.Karsten@KOM.tu-darmstadt.de

More information

Financial Services Design for High Availability

Financial Services Design for High Availability Financial Services Design for High Availability Version History Version Number Date Notes 1 March 28, 2003 This document was created. This document describes the best practice for building a multicast

More information

Software-Defined Networking (SDN) Overview

Software-Defined Networking (SDN) Overview Reti di Telecomunicazione a.y. 2015-2016 Software-Defined Networking (SDN) Overview Ing. Luca Davoli Ph.D. Student Network Security (NetSec) Laboratory davoli@ce.unipr.it Luca Davoli davoli@ce.unipr.it

More information

Configuring MSDP. Overview. How MSDP operates. MSDP peers

Configuring MSDP. Overview. How MSDP operates. MSDP peers Contents Configuring MSDP 1 Overview 1 How MSDP operates 1 MSDP support for VPNs 6 Protocols and standards 6 MSDP configuration task list 7 Configuring basic MSDP functions 7 Configuration prerequisites

More information

Deploying LISP Host Mobility with an Extended Subnet

Deploying LISP Host Mobility with an Extended Subnet CHAPTER 4 Deploying LISP Host Mobility with an Extended Subnet Figure 4-1 shows the Enterprise datacenter deployment topology where the 10.17.1.0/24 subnet in VLAN 1301 is extended between the West and

More information

HP 5920 & 5900 Switch Series

HP 5920 & 5900 Switch Series HP 5920 & 5900 Switch Series MPLS Configuration Guide Part number: 5998-4676a Software version: Release 23xx Document version: 6W101-20150320 Legal and notice information Copyright 2015 Hewlett-Packard

More information

Link layer: introduction

Link layer: introduction Link layer: introduction terminology: hosts and routers: nodes communication channels that connect adjacent nodes along communication path: links wired links wireless links LANs layer-2 packet: frame,

More information

Table of Contents. Cisco Introduction to EIGRP

Table of Contents. Cisco Introduction to EIGRP Table of Contents Introduction to EIGRP...1 Introduction...1 Before You Begin...1 Conventions...1 Prerequisites...1 Components Used...1 What is IGRP?...2 What is EIGRP?...2 How Does EIGRP Work?...2 EIGRP

More information

Configuring StackWise Virtual

Configuring StackWise Virtual Finding Feature Information, page 1 Restrictions for Cisco StackWise Virtual, page 1 Prerequisites for Cisco StackWise Virtual, page 2 Information About Cisco Stackwise Virtual, page 2 Cisco StackWise

More information

Multicast Communications

Multicast Communications Multicast Communications Multicast communications refers to one-to-many or many-tomany communications. Unicast Broadcast Multicast Dragkedja IP Multicasting refers to the implementation of multicast communication

More information

CS 268: Computer Networking. Taking Advantage of Broadcast

CS 268: Computer Networking. Taking Advantage of Broadcast CS 268: Computer Networking L-12 Wireless Broadcast Taking Advantage of Broadcast Opportunistic forwarding Network coding Assigned reading XORs In The Air: Practical Wireless Network Coding ExOR: Opportunistic

More information

Configuring multicast VPN

Configuring multicast VPN Contents Configuring multicast VPN 1 Multicast VPN overview 1 Multicast VPN overview 1 MD-VPN overview 3 Protocols and standards 6 How MD-VPN works 6 Share-MDT establishment 6 Share-MDT-based delivery

More information

Network Working Group. Category: Standards Track Juniper Networks J. Moy Sycamore Networks December 1999

Network Working Group. Category: Standards Track Juniper Networks J. Moy Sycamore Networks December 1999 Network Working Group Requests for Comments: 2740 Category: Standards Track R. Coltun Siara Systems D. Ferguson Juniper Networks J. Moy Sycamore Networks December 1999 OSPF for IPv6 Status of this Memo

More information

EEC-684/584 Computer Networks

EEC-684/584 Computer Networks EEC-684/584 Computer Networks Lecture 14 wenbing@ieee.org (Lecture nodes are based on materials supplied by Dr. Louise Moser at UCSB and Prentice-Hall) Outline 2 Review of last lecture Internetworking

More information

THE OSI MODEL. Application Presentation Session Transport Network Data-Link Physical. OSI Model. Chapter 1 Review.

THE OSI MODEL. Application Presentation Session Transport Network Data-Link Physical. OSI Model. Chapter 1 Review. THE OSI MODEL Application Presentation Session Transport Network Data-Link Physical OSI Model Chapter 1 Review By: Allan Johnson Table of Contents Go There! Go There! Go There! Go There! Go There! Go There!

More information

Lecture 3. The Network Layer (cont d) Network Layer 1-1

Lecture 3. The Network Layer (cont d) Network Layer 1-1 Lecture 3 The Network Layer (cont d) Network Layer 1-1 Agenda The Network Layer (cont d) What is inside a router? Internet Protocol (IP) IPv4 fragmentation and addressing IP Address Classes and Subnets

More information

Enhanced IGRP. Chapter Goals. Enhanced IGRP Capabilities and Attributes CHAPTER

Enhanced IGRP. Chapter Goals. Enhanced IGRP Capabilities and Attributes CHAPTER 40 CHAPTER Chapter Goals Identify the four key technologies employed by (EIGRP). Understand the Diffusing Update Algorithm (DUAL), and describe how it improves the operational efficiency of EIGRP. Learn

More information

ENTERPRISE MPLS. Kireeti Kompella

ENTERPRISE MPLS. Kireeti Kompella ENTERPRISE MPLS Kireeti Kompella AGENDA The New VLAN Protocol Suite Signaling Labels Hierarchy Signaling Advanced Topics Layer 2 or Layer 3? Resilience and End-to-end Service Restoration Multicast ECMP

More information

Forwarding Architecture

Forwarding Architecture Forwarding Architecture Brighten Godfrey CS 538 February 14 2018 slides 2010-2018 by Brighten Godfrey unless otherwise noted Building a fast router Partridge: 50 Gb/sec router A fast IP router well, fast

More information

Operation Administration and Maintenance in MPLS based Ethernet Networks

Operation Administration and Maintenance in MPLS based Ethernet Networks 199 Operation Administration and Maintenance in MPLS based Ethernet Networks Jordi Perelló, Luis Velasco, Gabriel Junyent Optical Communication Group - Universitat Politècnica de Cataluya (UPC) E-mail:

More information

Cisco Group Encrypted Transport VPN

Cisco Group Encrypted Transport VPN Cisco Group Encrypted Transport VPN Q. What is Cisco Group Encrypted Transport VPN? A. Cisco Group Encrypted Transport is a next-generation WAN VPN solution that defines a new category of VPN, one that

More information

IPv6: An Introduction

IPv6: An Introduction Outline IPv6: An Introduction Dheeraj Sanghi Department of Computer Science and Engineering Indian Institute of Technology Kanpur dheeraj@iitk.ac.in http://www.cse.iitk.ac.in/users/dheeraj Problems with

More information

Announcements. me your survey: See the Announcements page. Today. Reading. Take a break around 10:15am. Ack: Some figures are from Coulouris

Announcements.  me your survey: See the Announcements page. Today. Reading. Take a break around 10:15am. Ack: Some figures are from Coulouris Announcements Email me your survey: See the Announcements page Today Conceptual overview of distributed systems System models Reading Today: Chapter 2 of Coulouris Next topic: client-side processing (HTML,

More information

Measuring MPLS overhead

Measuring MPLS overhead Measuring MPLS overhead A. Pescapè +*, S. P. Romano +, M. Esposito +*, S. Avallone +, G. Ventre +* * ITEM - Laboratorio Nazionale CINI per l Informatica e la Telematica Multimediali Via Diocleziano, 328

More information

Routing Overview. Information About Routing CHAPTER

Routing Overview. Information About Routing CHAPTER 21 CHAPTER This chapter describes underlying concepts of how routing behaves within the ASA, and the routing protocols that are supported. This chapter includes the following sections: Information About

More information

Developing deterministic networking technology for railway applications using TTEthernet software-based end systems

Developing deterministic networking technology for railway applications using TTEthernet software-based end systems Developing deterministic networking technology for railway applications using TTEthernet software-based end systems Project n 100021 Astrit Ademaj, TTTech Computertechnik AG Outline GENESYS requirements

More information

What is an L3 Master Device?

What is an L3 Master Device? What is an L3 Master Device? David Ahern Cumulus Networks Mountain View, CA, USA dsa@cumulusnetworks.com Abstract The L3 Master Device (l3mdev) concept was introduced to the Linux networking stack in v4.4.

More information

Syed Mehar Ali Shah 1 and Bhaskar Reddy Muvva Vijay 2* 1-

Syed Mehar Ali Shah 1 and Bhaskar Reddy Muvva Vijay 2* 1- International Journal of Basic and Applied Sciences Vol. 3. No. 4 2014. Pp. 163-169 Copyright by CRDEEP. All Rights Reserved. Full Length Research Paper Improving Quality of Service in Multimedia Applications

More information

MPLS VPN Inter-AS Option AB

MPLS VPN Inter-AS Option AB First Published: December 17, 2007 Last Updated: September 21, 2011 The feature combines the best functionality of an Inter-AS Option (10) A and Inter-AS Option (10) B network to allow a Multiprotocol

More information

Table of Contents 1 Static Routing Configuration RIP Configuration 2-1

Table of Contents 1 Static Routing Configuration RIP Configuration 2-1 Table of Contents 1 Static Routing Configuration 1-1 Introduction 1-1 Static Route 1-1 Default Route 1-1 Application Environment of Static Routing 1-1 Configuring a Static Route 1-2 Configuration Prerequisites

More information

Configuring IP Multicast Routing

Configuring IP Multicast Routing 34 CHAPTER This chapter describes how to configure IP multicast routing on the Cisco ME 3400 Ethernet Access switch. IP multicasting is a more efficient way to use network resources, especially for bandwidth-intensive

More information

Operation Manual IPv4 Routing H3C S3610&S5510 Series Ethernet Switches. Table of Contents

Operation Manual IPv4 Routing H3C S3610&S5510 Series Ethernet Switches. Table of Contents Table of Contents Table of Contents Chapter 1 Static Routing Configuration... 1-1 1.1 Introduction... 1-1 1.1.1 Static Route... 1-1 1.1.2 Default Route... 1-1 1.1.3 Application Environment of Static Routing...

More information

Data Link Layer. Our goals: understand principles behind data link layer services: instantiation and implementation of various link layer technologies

Data Link Layer. Our goals: understand principles behind data link layer services: instantiation and implementation of various link layer technologies Data Link Layer Our goals: understand principles behind data link layer services: link layer addressing instantiation and implementation of various link layer technologies 1 Outline Introduction and services

More information

Outline. CS5984 Mobile Computing. Host Mobility Problem 1/2. Host Mobility Problem 2/2. Host Mobility Problem Solutions. Network Layer Solutions Model

Outline. CS5984 Mobile Computing. Host Mobility Problem 1/2. Host Mobility Problem 2/2. Host Mobility Problem Solutions. Network Layer Solutions Model CS5984 Mobile Computing Outline Host Mobility problem and solutions IETF Mobile IPv4 Dr. Ayman Abdel-Hamid Computer Science Department Virginia Tech Mobile IPv4 1 2 Host Mobility Problem 1/2 Host Mobility

More information

Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises

Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises Full paper available at http://www.cs.princeton.edu/~chkim Changhoon Kim, Matthew Caesar, and Jennifer Rexford Outline of Today

More information

Open Shortest Path First (OSPF)

Open Shortest Path First (OSPF) CHAPTER 42 Open Shortest Path First (OSPF) Background Open Shortest Path First (OSPF) is a routing protocol developed for Internet Protocol (IP) networks by the interior gateway protocol (IGP) working

More information

LECTURE 8. Mobile IP

LECTURE 8. Mobile IP 1 LECTURE 8 Mobile IP What is Mobile IP? The Internet protocol as it exists does not support mobility Mobile IP tries to address this issue by creating an anchor for a mobile host that takes care of packet

More information

HP Load Balancing Module

HP Load Balancing Module HP Load Balancing Module High Availability Configuration Guide Part number: 5998-2687 Document version: 6PW101-20120217 Legal and notice information Copyright 2012 Hewlett-Packard Development Company,

More information

Outline. CS6504 Mobile Computing. Host Mobility Problem 1/2. Host Mobility Problem 2/2. Dr. Ayman Abdel-Hamid. Mobile IPv4.

Outline. CS6504 Mobile Computing. Host Mobility Problem 1/2. Host Mobility Problem 2/2. Dr. Ayman Abdel-Hamid. Mobile IPv4. CS6504 Mobile Computing Outline Host Mobility problem and solutions IETF Mobile IPv4 Dr. Ayman Abdel-Hamid Computer Science Department Virginia Tech Mobile IPv4 1 2 Host Mobility Problem 1/2 Host Mobility

More information

Configuring Rapid PVST+

Configuring Rapid PVST+ This chapter describes how to configure the Rapid per VLAN Spanning Tree (Rapid PVST+) protocol on Cisco NX-OS devices using Cisco Data Center Manager (DCNM) for LAN. For more information about the Cisco

More information

Lab 4: Routing using OSPF

Lab 4: Routing using OSPF Network Topology:- Lab 4: Routing using OSPF Device Interface IP Address Subnet Mask Gateway/Clock Description Rate Fa 0/0 172.16.1.17 255.255.255.240 ----- R1 LAN R1 Se 0/0/0 192.168.10.1 255.255.255.252

More information

MODELS OF DISTRIBUTED SYSTEMS

MODELS OF DISTRIBUTED SYSTEMS Distributed Systems Fö 2/3-1 Distributed Systems Fö 2/3-2 MODELS OF DISTRIBUTED SYSTEMS Basic Elements 1. Architectural Models 2. Interaction Models Resources in a distributed system are shared between

More information

Contents. EVPN overview 1

Contents. EVPN overview 1 Contents EVPN overview 1 EVPN network model 1 MP-BGP extension for EVPN 2 Configuration automation 3 Assignment of traffic to VXLANs 3 Traffic from the local site to a remote site 3 Traffic from a remote

More information

IPv6 PIM. Based on the forwarding mechanism, IPv6 PIM falls into two modes:

IPv6 PIM. Based on the forwarding mechanism, IPv6 PIM falls into two modes: Overview Protocol Independent Multicast for IPv6 () provides IPv6 multicast forwarding by leveraging static routes or IPv6 unicast routing tables generated by any IPv6 unicast routing protocol, such as

More information

Configuring STP and RSTP

Configuring STP and RSTP 7 CHAPTER Configuring STP and RSTP This chapter describes the IEEE 802.1D Spanning Tree Protocol (STP) and the ML-Series implementation of the IEEE 802.1W Rapid Spanning Tree Protocol (RSTP). It also explains

More information

Configuring IP Multicast Routing

Configuring IP Multicast Routing 39 CHAPTER This chapter describes how to configure IP multicast routing on the Catalyst 3560 switch. IP multicasting is a more efficient way to use network resources, especially for bandwidth-intensive

More information

Finish Network Layer Start Transport Layer. CS158a Chris Pollett Apr 25, 2007.

Finish Network Layer Start Transport Layer. CS158a Chris Pollett Apr 25, 2007. Finish Network Layer Start Transport Layer CS158a Chris Pollett Apr 25, 2007. Outline OSPF BGP IPv6 Transport Layer Services Sockets Example Socket Program OSPF We now look at routing in the internet.

More information

Chapter Motivation For Internetworking

Chapter Motivation For Internetworking Chapter 17-20 Internetworking Part 1 (Concept, IP Addressing, IP Routing, IP Datagrams, Address Resolution 1 Motivation For Internetworking LANs Low cost Limited distance WANs High cost Unlimited distance

More information

Netlink2 as ForCES protocol (update)

Netlink2 as ForCES protocol (update) 57 th IETF, Juy 14 th, 2003 Netlink2 as ForCES protocol (update) draft-jhsrha-forces-netlink2-01.txt presentation available online at http://www.zurich.ibm.com/~rha/netlink2-1.pdf Robert Haas, IBM Research

More information

Table of Contents 1 OSPF Configuration 1-1

Table of Contents 1 OSPF Configuration 1-1 Table of Contents 1 OSPF Configuration 1-1 Introduction to OSPF 1-1 Basic Concepts 1-2 OSPF Area Partition 1-4 Router Types 1-7 Classification of OSPF Networks 1-9 DR and BDR 1-9 OSPF Packet Formats 1-11

More information

Cisco ASR 1000 Series Aggregation Services Routers: QoS Architecture and Solutions

Cisco ASR 1000 Series Aggregation Services Routers: QoS Architecture and Solutions Cisco ASR 1000 Series Aggregation Services Routers: QoS Architecture and Solutions Introduction Much more bandwidth is available now than during the times of 300-bps modems, but the same business principles

More information

Chapter 5 Network Layer: The Control Plane

Chapter 5 Network Layer: The Control Plane Chapter 5 Network Layer: The Control Plane A note on the use of these Powerpoint slides: We re making these slides freely available to all (faculty, students, readers). They re in PowerPoint form so you

More information

6 MPLS Model User Guide

6 MPLS Model User Guide 6 MPLS Model User Guide Multi-Protocol Label Switching (MPLS) is a multi-layer switching technology that uses labels to determine how packets are forwarded through a network. The first part of this document

More information

ET4254 Communications and Networking 1

ET4254 Communications and Networking 1 Topic 9 Internet Protocols Aims:- basic protocol functions internetworking principles connectionless internetworking IP IPv6 IPSec 1 Protocol Functions have a small set of functions that form basis of

More information

A MAC Layer Abstraction for Heterogeneous Carrier Grade Mesh Networks

A MAC Layer Abstraction for Heterogeneous Carrier Grade Mesh Networks ICT-MobileSummit 2009 Conference Proceedings Paul Cunningham and Miriam Cunningham (Eds) IIMC International Information Management Corporation, 2009 ISBN: 978-1-905824-12-0 A MAC Layer Abstraction for

More information

Data Plane Monitoring in Segment Routing Networks Faisal Iqbal Cisco Systems Clayton Hassen Bell Canada

Data Plane Monitoring in Segment Routing Networks Faisal Iqbal Cisco Systems Clayton Hassen Bell Canada Data Plane Monitoring in Segment Routing Networks Faisal Iqbal Cisco Systems (faiqbal@cisco.com) Clayton Hassen Bell Canada (clayton.hassen@bell.ca) Reference Topology & Conventions SR control plane is

More information

Lecture 2: Basic routing, ARP, and basic IP

Lecture 2: Basic routing, ARP, and basic IP Internetworking Lecture 2: Basic routing, ARP, and basic IP Literature: Forouzan, TCP/IP Protocol Suite: Ch 6-8 Basic Routing Delivery, Forwarding, and Routing of IP packets Connection-oriented vs Connectionless

More information

6.9. Communicating to the Outside World: Cluster Networking

6.9. Communicating to the Outside World: Cluster Networking 6.9 Communicating to the Outside World: Cluster Networking This online section describes the networking hardware and software used to connect the nodes of cluster together. As there are whole books and

More information

Better Approach To Mobile Adhoc Networking

Better Approach To Mobile Adhoc Networking Better Approach To Mobile Adhoc Networking batman-adv - Kernel Space L2 Mesh Routing Martin Hundebøll Aalborg University, Denmark March 28 th, 2014 History of batman-adv The B.A.T.M.A.N. protocol initiated

More information

Global IP Network System Large-Scale, Guaranteed, Carrier-Grade

Global IP Network System Large-Scale, Guaranteed, Carrier-Grade Global Network System Large-Scale, Guaranteed, Carrier-Grade 192 Global Network System Large-Scale, Guaranteed, Carrier-Grade Takanori Miyamoto Shiro Tanabe Osamu Takada Shinobu Gohara OVERVIEW: traffic

More information

Configuring Rapid PVST+ Using NX-OS

Configuring Rapid PVST+ Using NX-OS Configuring Rapid PVST+ Using NX-OS This chapter describes how to configure the Rapid per VLAN Spanning Tree (Rapid PVST+) protocol on Cisco NX-OS devices. This chapter includes the following sections:

More information

Integrated Services. Integrated Services. RSVP Resource reservation Protocol. Expedited Forwarding. Assured Forwarding.

Integrated Services. Integrated Services. RSVP Resource reservation Protocol. Expedited Forwarding. Assured Forwarding. Integrated Services An architecture for streaming multimedia Aimed at both unicast and multicast applications An example of unicast: a single user streaming a video clip from a news site An example of

More information

UNIT IV -- TRANSPORT LAYER

UNIT IV -- TRANSPORT LAYER UNIT IV -- TRANSPORT LAYER TABLE OF CONTENTS 4.1. Transport layer. 02 4.2. Reliable delivery service. 03 4.3. Congestion control. 05 4.4. Connection establishment.. 07 4.5. Flow control 09 4.6. Transmission

More information

Introduction to MPLS APNIC

Introduction to MPLS APNIC Introduction to MPLS APNIC Issue Date: [201609] Revision: [01] What is MPLS? 2 Definition of MPLS Multi Protocol Label Switching Multiprotocol, it supports ANY network layer protocol, i.e. IPv4, IPv6,

More information

Configure SR-TE Policies

Configure SR-TE Policies This module provides information about segment routing for traffic engineering (SR-TE) policies, how to configure SR-TE policies, and how to steer traffic into an SR-TE policy. About SR-TE Policies, page

More information