
On the use of connection-oriented networks to support Grid computing

Malathi Veeraraghavan, Xuan Zheng, Zhanxiang Huang
University of Virginia
{malathi, xuan, zh4c}@virginia.edu

Abstract -- The vision of Grid Computing is to enable an arbitrary set of general-purpose computers to be recruited dynamically and interconnected through a general-purpose network for the parallel execution of complex programs. The scale and ubiquity of the Internet make it the natural network of choice for Grid computing. However, for some applications, there is a need for rate-/delay-guaranteed communication services. These needs are driving the exploration of connection-oriented (CO) optical networks as a candidate for Grid computing. In this paper, we consider the suitability of CO networks equipped with Generalized MultiProtocol Label Switching (GMPLS) control-plane protocols for Grid Computing. We identify two areas in which current GMPLS implementations need enhancements to better support the needs of Grid Computing. First, we note a need to improve call setup delays by several orders of magnitude. We describe our proof-of-concept prototype implementation of a hardware-accelerated RSVP-TE engine that cuts setup delays from the hundreds of milliseconds typical in current equipment to the order of microseconds. Second, noting the availability of different types of CO networks, we present a case for enhancements to control heterogeneous connections, i.e., connections that traverse multiple types of CO networks. We describe a distributed signaling procedure for the setup of such connections.

I. INTRODUCTION

The vision of Grid Computing is to enable an arbitrary set of computers to be pulled together over a general-purpose network such as the Internet to cooperatively run a computation. While Grid Computing started as a project to link supercomputing sites [1], current work on Grid computing aims to interconnect computers from different administrative domains on an as-needed basis.
Adaptability (or dynamicity), scalability and heterogeneity are important characteristics of a Grid [1]. Further, the ability to deliver service at various levels of quality (for metrics such as response time) is listed in [2] as a key characteristic of a Grid. Thus, any networking technology proposed to serve the needs of Grid computing should take these goals into account. Connection-Oriented (CO) networks are better equipped to deliver rate-/delay-guaranteed services than the existing connectionless Internet. Typically, the metric traded off to achieve rate/delay guarantees is network resource utilization. However, a networking solution with lax requirements on utilization will be unscalable. Given the adaptability and scalability goals of the computing Grid, CO networks should be both dynamically configurable and scalable.

Control-plane protocols developed for CO networks under the umbrella term Generalized MultiProtocol Label Switching (GMPLS) enable both dynamic configurability and scalability. The purpose of these protocols is to support on-demand requests for connectivity at a specified bandwidth level. Applications could request such connectivity as needed and release the bandwidth when done. While switches deployed in today's Internet have the data-plane capabilities needed to support CO services, most are not equipped with dynamic bandwidth-provisioning control-plane capabilities. For example, Ethernet switches deployed within enterprises are now equipped with the IEEE 802.1q protocol, which allows for the creation of quality-of-service guaranteed virtual circuits (VCs) in the form of Virtual LANs (VLANs). Similarly, MANs and WANs are built with CO (circuit-switched) Synchronous Optical Network (SONET)/Synchronous Digital Hierarchy (SDH) switches. However, control-plane capability upgrades are required to enable these networks to support dynamic bandwidth provisioning, without which the adaptability and scalability goals of Grid computing cannot be met.
In Section II, we review the GMPLS control-plane protocols and describe the CO capabilities of existing switches. We also briefly review the state of existing networks with regard to their CO service offerings. In Section III, we evaluate the suitability of GMPLS control-plane equipped CO networks for Grid computing. On the positive side, we note that the control-plane support for dynamic bandwidth provisioning and its specification for a distributed implementation are key factors that will address the adaptability and scalability goals of Grid computing. On the negative side, we note a need for faster signaling protocol implementations to reduce the overhead of circuit/VC setup delay, and the lack of support in the current control-plane specifications for heterogeneous connections, i.e., those that traverse different types of CO networks. In Section IV, we discuss our research contributions towards these missing components. Specifically, we describe our work on hardware-accelerated implementations of signaling protocols, and a procedure for setting up heterogeneous connections. The ability to support connections through different types of CO networks is important given the scalability goals of Grid computing. We conclude the paper in Section V.

II. BACKGROUND

In this section, we review different types of GMPLS networks and control-plane protocols. We then briefly describe existing equipment and currently available networks in which CO services can be enabled.

II.A CO networks and GMPLS control-plane protocols

CO networks are of two types: packet-switched and circuit-switched. Packet-switched CO networks include:
- Intserv IP networks [3] (in which virtual circuits are identified at switches by the five-tuple of source and destination IP addresses in IP headers, source and destination port numbers in transport-layer protocol headers, and the IP header protocol type),
- MultiProtocol Label Switched (MPLS) [4] and Asynchronous Transfer Mode (ATM) networks, and
- IEEE 802.1p and 802.1q Virtual LAN (VLAN) Ethernet-switch based networks [5].

Circuit-switched networks include:
- Time-Division Multiplexed (TDM) SONET/SDH networks,
- All-optical Wavelength Division Multiplexed (WDM) networks, and
- Space-Division Multiplexed (SDM) Ethernet-switch based networks (an SDM connection is created by mapping two ports into an untagged VLAN).

The GMPLS control-plane protocols are defined as a common control plane for these different types of CO networks, even though their data-plane protocols differ significantly. This common control plane consists of:
1. the Link Management Protocol (LMP) [6],
2. the Open Shortest Path First - Traffic Engineering (OSPF-TE) routing protocol [7], and
3. the Resource Reservation Protocol - Traffic Engineering (RSVP-TE) signaling protocol [8].

These three protocols are designed to be implemented in a control processor at each network switch. Each of these protocols provides an increasing degree of automation, and correspondingly decreasing dependence upon manual network administration.
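As a compact summary of the CO network taxonomy above, the types and the "label" that identifies a connection in each can be tabulated; the dictionary and helper below are our illustration, not something from the paper.

```python
# Illustrative summary (ours) of the CO network types listed above and the
# label that identifies a connection at a switch in each technology.
CONNECTION_IDENTIFIER = {
    # packet-switched CO networks: an explicit label travels in each packet
    "intserv_ip": "five-tuple (src/dst IP, src/dst port, protocol)",
    "mpls": "MPLS label",
    "atm": "VPI/VCI",
    "vlan_ethernet": "IEEE 802.1q VLAN ID",
    # circuit-switched CO networks: the label is implicit/positional
    "tdm_sonet_sdh": "timeslot",
    "wdm": "wavelength",
    "sdm_ethernet": "port pair (two ports in an untagged VLAN)",
}

PACKET_SWITCHED = {"intserv_ip", "mpls", "atm", "vlan_ethernet"}

def label_kind(network: str) -> str:
    """'explicit' for packet-switched CO networks (the label is carried in
    every packet); 'implicit' for circuit-switched ones (the label is a
    timeslot, wavelength or port position)."""
    return "explicit" if network in PACKET_SWITCHED else "implicit"
```

This explicit/implicit distinction reappears later in the paper's discussion of label management in the control plane.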
This triple combination serves as an excellent basis on which to create large-scale CO networks, in which switches can cooperate in a completely automated fashion to respond to requests for end-to-end bandwidth. We consider each protocol in a little more detail below, starting with LMP.

The main purpose of the LMP module is to automatically establish and manage the control channels between adjacent nodes, to discover and verify data-plane connectivity, and to correlate data-plane link properties. In GMPLS networks, there could be multiple data-plane links between two adjacent nodes, and the control channel could be established on a separate physical link from any of the data-plane links. A mechanism is required to automatically discover these data-plane links, verify their properties, combine them into a single traffic-engineering (TE) link, and correlate the data-plane links with the control channel. Thus, LMP contributes to our plug-and-play goal for CO networks by minimizing manual administration.

The purpose of the OSPF-TE routing protocol software module located at a switch is to enable the switch to send the topology, reachability and loading conditions of its interfaces to other switches, and to receive corresponding information from other switches. This data dissemination process allows the route computation module at the switch to determine the next-hop switch toward which to direct a connection setup (this module could be part of the signaling protocol module, or could be used to precompute routing data ahead of when call setup requests arrive). As a routing protocol, its value in creating large-scale connectionless networks has already been observed with the success of the Internet. Admittedly, being a link-state protocol, it is only used intra-domain (i.e., within the network of an organization, referred to as an autonomous system (AS)), and even within this context, it uses a two-layer hierarchy of areas and a backbone area.
In conjunction with the distance-vector based inter-domain routing protocol, the Border Gateway Protocol (BGP), we have a highly decentralized automated mechanism to spread routing information, which was critical to the scaling of the Internet.

Finally, the purpose of an RSVP-TE signaling engine at a switch is to manage the bandwidth of all the interfaces on the switch, and to program the data-plane switch hardware to enable it to forward demultiplexed incoming user bits or packets as and when they arrive. An RSVP-TE engine implemented in a control card at a switch executes three steps when it receives a connection setup Path message (i.e., a request for bandwidth), as shown in Figure 1.

1. Route computation. Based on the destination address to which the connection is requested (D, in the example shown in Figure 1), the RSVP-TE engine determines the next-hop switch toward which to route the connection, or a subset of switches on the end-to-end path within its area of its domain. Constrained Shortest Path First (CSPF) algorithms can only be executed intra-area because of the intra-area scope of bandwidth-related parameters in OSPF-TE messages.

2. Bandwidth and label management. If the switch is in a position to compute only the next-hop switch in the route computation phase, then it needs to check whether there is sufficient bandwidth on a link connected to the next-hop switch. If it performed CSPF to determine a part of the end-to-end route, i.e., the subset of switches on the path within its area of its domain, then this step of bandwidth management is integrated with the partial route computation. But at subsequent switches within the area, this step is required to check whether there is sufficient bandwidth available on the link to the next hop indicated in the partial source route passed within the Path signaling message (see Figure 1 for how Path messages travel hop-by-hop). This is because local conditions can change between the last routing protocol update, which provided the data used in the CSPF computation, and the arrival of the call being set up. Typical implementations use a call-blocking approach in which calls are simply rejected if sufficient bandwidth is not available, and no existing call is preempted. Label management is the selection of labels to be used on the incoming and outgoing interfaces of the switch. Labels can be explicit in the data plane, e.g., labels used within packet headers in virtual circuit (VC) networks, or implicit, e.g., time slots, wavelengths or interface identifiers in TDM, WDM and SDM networks, respectively. In the control plane, labels are explicit in both packet- and circuit-switched networks, with the labels identifying the timeslots, wavelengths or interface identifiers to be used for the connection across a circuit switch. These labels are used in the next step.

3. Switch fabric configuration. This step configures the switch fabric to forward user data as and when it arrives. This function maps incoming labels associated with input interfaces to outgoing labels on appropriate outgoing interfaces. In packet switches, there is an additional step to program the scheduler to enable it to serve packets arriving on the virtual circuit being set up at the requested bandwidth level.

We do not show in Figure 1 the rest of the call setup procedure: the continuation of the Path message propagation hop-by-hop, as well as the Resv message returning in the reverse direction, which implicitly confirms successful connection setup. Detailed procedures are also defined in RSVP-TE for when call setup fails.

Figure 1. Distributed call setup process progressing hop-by-hop

II.B Existing switches, gateways and networks

The most common network switches today are Ethernet switches, IP routers and SONET/SDH switches. The first two are primarily connectionless packet switches. However, increasingly, Ethernet switches have VLAN capabilities with limited quality-of-service support. A VLAN is constructed by programming the switch to include two or more ports. It can be tagged or untagged. In tagged mode, all Ethernet frames are tagged with a VLAN header that includes a VLAN ID. Frames tagged with the same VLAN ID are treated in the same manner, i.e., they are forwarded to all the ports belonging to that VLAN. An untagged VLAN with two ports is essentially a space-division multiplexed circuit, because all Ethernet frames arriving on either port are sent exclusively to the other port; no frames arriving on other ports are forwarded to ports in an untagged VLAN. Ethernet switches from Extreme Networks, Dell, Cisco, Intel, Foundry and Force 10, to name just a few vendors, have these capabilities.
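The three-step Path-message handling of Section II.A can be sketched in a few lines. This is a toy model (all class and variable names are ours), with a precomputed next-hop table standing in for CSPF and the call-blocking admission check described above.

```python
class ToySwitch:
    """Toy model (ours) of an RSVP-TE engine's three steps on receiving a
    Path message: route computation, bandwidth/label management, and
    switch fabric configuration."""

    def __init__(self, next_hop_table, link_bandwidth_mbps):
        self.next_hop_table = next_hop_table    # destination -> next-hop name
        self.links = dict(link_bandwidth_mbps)  # next-hop -> free Mb/s
        self.crossconnects = {}                 # in-label -> (next hop, out-label)
        self._next_label = 100

    def handle_path(self, dest, rate_mbps, in_label):
        # Step 1: route computation. Here a precomputed table; a real engine
        # would run (C)SPF over OSPF-TE data within its area.
        next_hop = self.next_hop_table[dest]
        # Step 2a: bandwidth management, call-blocking model: reject the call
        # if the link to the next hop cannot carry the requested rate.
        if self.links[next_hop] < rate_mbps:
            return None                         # call blocked
        self.links[next_hop] -= rate_mbps
        # Step 2b: label management: pick an outgoing label for the connection.
        out_label = self._next_label
        self._next_label += 1
        # Step 3: fabric configuration: map the incoming label to the
        # outgoing interface and label.
        self.crossconnects[in_label] = (next_hop, out_label)
        return next_hop, out_label

sw = ToySwitch({"D": "B"}, {"B": 1000})
print(sw.handle_path("D", 600, in_label=7))   # admitted
print(sw.handle_path("D", 600, in_label=8))   # blocked: only 400 Mb/s left
```

In the real protocol, of course, the Path message then propagates to the next hop and the reservation is confirmed by the returning Resv message; this sketch shows only the per-switch decision.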
Thus, the data-plane capabilities required to create circuits/VCs through Ethernet switches are now available. Control-plane software to dynamically set up and release circuits is, however, not implemented within these switches. The DRAGON project has developed a software module called the Virtual Label Switch Router (VLSR), which implements the RSVP-TE and OSPF-TE protocols [9]. It runs on an external Linux host connected to the Ethernet switch whose bandwidth it manages, and issues Simple Network Management Protocol (SNMP) commands to create the VLANs for admitted connections [10]. With this external software, the Ethernet switches become fully equipped CO switches.

IP routers are equipped with MPLS engines and RSVP-TE signaling software for dynamic control of MPLS virtual circuits. Both Cisco and Juniper routers support MPLS. SONET/SDH and WDM switches are circuit switches in which timeslots and/or wavelengths are mapped from incoming to outgoing interfaces. Some of these switches now support RSVP-TE and OSPF-TE control-plane implementations. For example, Sycamore SONET switches implement these protocols. Examples of WDM switches that implement GMPLS control-plane protocols include Movaz and Calient WDM equipment. In addition to supporting pure CO switching functionality, some of this equipment can be used as gateways to interconnect different types of networks.

Before describing the gateway functionality of these pieces of equipment, we establish some terminology. We define the term network to consist of switches and endpoints (data-sourcing and -sinking entities) interconnected by shared communication links, on which the sharing (multiplexing) mechanism is the same on all links. Further, we define the term switch as an entity in which all links (interfaces) support the same (single) form of multiplexing (referred to as switching capability in [11]). For example, a SONET switch is one in which all interfaces carry TDM signals formatted according to the SONET multiplexing standards, and a SONET network is one in which all the switches are SONET switches. Typical endpoints in a SONET network are IP routers with SONET line cards; these nodes are endpoints in the SONET network as they source and sink the data carried on the SONET network. We use the term internetwork to denote an interconnection of networks (referred to as multi-region networks in [11]). Entities (nodes) that interconnect networks necessarily need the ability to support interfaces with different types of multiplexing capabilities, minimally two; we use the term gateways to refer to such nodes. An IP router is a packet-based gateway in the connectionless Internet, with different line cards implementing the protocols of the networks to which they are connected. The gateway functionality is achieved by the IP implementation within the router examining IP datagram headers to determine how to route a packet from an incoming network to an appropriate outgoing network.
In contrast, gateways in a connection-oriented internetwork move data from one network to another using circuit/VC techniques. For example, Ethernet cards in a Sycamore SN16000 implement the Generic Framing Procedure (GFP) Ethernet-to-SONET encapsulation to map all frames received on any of its Ethernet ports into a port on a SONET line card to which it is cross-connected. We thus refer to these gateways as circuit/VC gateways, to contrast them with packet-based gateways. Another example of a VC gateway is a Cisco GSR 12008, which supports line cards that can be programmed to map all frames arriving on a specific VLAN into an MPLS tunnel set up on one of its other ports. It thus interconnects a VLAN-based CO network to an MPLS-based CO network. While the data-plane capabilities for extracting data from one type of multiplexed connection and sending it on to a different type of multiplexed connection are available, the control-plane capabilities for controlling such circuits/VCs are not yet standardized, and hence not implemented.

Finally, as for current CO network deployments, SONET/SDH and WDM networks are already in widespread deployment. However, the dynamic bandwidth-provisioning capability supported by the GMPLS control-plane protocols, while available in some deployed switches, is not yet made available to users. Similarly, the Abilene backbone of Internet2 and DOE's ESnet have routers with built-in MPLS and RSVP-TE capabilities. There are ongoing research projects [12][13] to enable the use of dynamically requested virtual circuits through these networks. Examples of experimental CO network testbeds include CHEETAH [14], a SONET-based network, and DRAGON [9], a WDM-based network.

III. ARE CO NETWORKS SUITABLE FOR GRID COMPUTING?
III.A On the positive side

In Section I, we noted adaptability/dynamicity, scalability, heterogeneity, the ability to span different administrative domains, and the support for various quality-of-service levels as key attributes of Grid computing. Specifically, the last attribute of guaranteed service quality is served well by CO networking. For the communication aspect of the service, delay/jitter can be guaranteed once a circuit/VC is established. To offer this ability to establish a circuit/VC between any two computers or clusters, bandwidth sharing on the CO network must be dynamic. In other words, an application program running on a computer should be able to dynamically request a circuit to a distant computer and have this request filled cooperatively by the CO network switches on the end-to-end path between these computers. GMPLS control-plane protocols define the procedures for the handling of such on-demand calls, i.e., immediate requests for connectivity at a guaranteed rate. The adaptability/dynamicity feature of Grids makes support for immediate requests for bandwidth necessary in a CO network.

The scalability attribute of Grids requires that CO networks that serve Grids be scalable. For example, while a centralized bandwidth management approach could be implemented to receive, process and grant dynamic requests for bandwidth/connectivity, it would not be scalable. The RSVP-TE and OSPF-TE protocols of the GMPLS control plane are designed for distributed implementation at each switch. Bandwidth of a switch's interfaces is locally managed by the switch's RSVP-TE engine. Procedures for the cooperation of switches on the end-to-end path, to ensure the availability of the requested bandwidth on all links of the circuit, are defined as part of the protocols. Thus, GMPLS control-plane protocols (1) enable large-scale CO networks to be created, and (2) enable these networks to respond to on-demand requests for rate-guaranteed connectivity. Both these features make GMPLS-based CO networks well suited to serve Grids.

III.B On the negative side

First, we argue for faster RSVP-TE implementations. Our laboratory experiments with commercial RSVP-TE implementations show call processing delays of 166ms per switch [15]. Contrast this with the time taken to send a burst of data from one module of a parallel program to a remote module. If the burst is 100MB and the circuit rate is 1Gbps, the transfer itself will take only 800ms. If the end-to-end path has four switches, then the call setup delay is nearly as long as the transfer delay. This results in a roughly 50% reduction of link utilization. Holding the circuit open in anticipation of further bursts would be useful only if the parallel program is communication-intensive. For computationally intensive applications, holding open such circuits means denying other users bandwidth for their applications. Running networks at low utilization will result in low amortization of operational expenses, which in turn will discourage the growth of these networks. The faster the response times of signaling engines, the lower the cost to an application of releasing and reacquiring bandwidth as and when needed. This will lead to increased sharing, reduced per-user costs, and thus improved prospects for growth.
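The arithmetic behind this utilization argument can be checked directly. The numbers below are the ones quoted in the text; the variable names are ours.

```python
# A 100 MB burst over a 1 Gb/s circuit, with 166 ms of call processing at
# each of four switches (figures quoted above).
burst_bits = 100 * 8 * 10**6      # 100 MB in bits (decimal megabytes)
circuit_bps = 10**9               # 1 Gb/s circuit rate
transfer_s = burst_bits / circuit_bps   # 0.8 s for the burst itself
setup_s = 4 * 0.166                     # 0.664 s of per-switch processing
utilization = transfer_s / (transfer_s + setup_s)
print(f"transfer={transfer_s:.3f}s setup={setup_s:.3f}s "
      f"utilization={utilization:.0%}")
```

The circuit carries user data for only about 55% of the time it is held, which is the roughly 50% utilization loss referred to above.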
Noting the importance of scalability to Grids, we regard this improvement in the performance of RSVP-TE implementations as necessary to the success of CO networks.

Second, consider the attributes of heterogeneity and the need to support Grids across various administrative domains. These imply that the circuits/VCs used to interconnect computers in a Grid are highly likely to traverse different types of CO networks. The set of GMPLS protocols standardized today enables the control of homogeneous circuits/VCs rather than heterogeneous ones. With the dominance of Ethernet/VLANs in LANs, the dominance of SONET in commercial MANs/WANs, the deployment of new optical circuit-switched WDM-based networks, and the dominance of MPLS in Internet2/ESnet, the need to extend GMPLS protocols to support heterogeneous connections through an internetwork of different types of CO networks becomes paramount if CO networking is to succeed in meeting the needs of a global computing Grid. Other aspects lacking in GMPLS control-plane protocols include the ability to make advance reservations for bandwidth, and hooks for security-related functions such as authorization. In the next section, we describe our contributions on the first two components noted above, and defer these other aspects to future publications.

IV. OUR CONTRIBUTIONS

In this section, we describe our work on a hardware-accelerated RSVP-TE implementation, and propose techniques for using GMPLS control-plane protocols to support heterogeneous connections.

IV.A Fast call setup with hardware-accelerated signaling implementations

Call setup delay consists of three components: (1) round-trip propagation delay to send call setup signaling messages, (2) call processing delays at each switch along the path of the connection to receive and process call setup messages, and (3) transmission delay for the call setup signaling messages.
The first component, round-trip propagation delay, should be considered in selecting the computers to be included as part of a Grid computation. To reduce call processing delays, we implemented a subset of RSVP-TE for SONET/SDH networks in hardware. Our approach is to implement only the time-critical operations of the signaling protocol in hardware, and to relegate the non-time-critical operations to software. We described the results of this work in [16], in which we showed that call setup delay per switch can be reduced to 4µs. These results are for an electronic high-speed switch in which the switch programming delay (step 3 in the procedure reviewed in Section II) is on the order of nanoseconds. If MicroElectroMechanical Systems (MEMS) all-optical switches are used, the switch programming delay is itself on the order of 3ms. Since electronic TDM (SONET) switches or SDM switches (Ethernet switches with ports mapped to an untagged VLAN) can handle 1Gbps to 10Gbps circuit rates, we recommend the use of these switches, at least in the near term until fast-programmable optical switches become available. Finally, signaling message transmission delay per hop is also in microseconds, because call setup RSVP-TE Path and Resv messages are on the order of 100-200 bytes. If the control channels are at least 100Mbps (out-of-band signaling over the Internet is expected to be the most common control-plane implementation in CO networks), the per-hop transmission delay is 8-16µs.

By reducing call processing and signaling message transmission delays to microseconds, the dominant factor in total call setup delay will be propagation delay, especially in wide-area Grids. This will be a significant improvement over current call setup delays. As described in Section III.B, such significant reductions in call setup delays will encourage applications to release circuits whenever unused, thus increasing utilization, decreasing per-user costs and encouraging growth.

IV.B Heterogeneous connections

We start by considering the data-plane aspect of heterogeneous connections in Section IV.B.1: what happens in the data plane at the demarcation points between networks, i.e., at gateways, in contrast to what happens at switches within a network? (Recall our definitions of the terms switch and gateway from Section II.) In Section IV.B.2, we describe a method to set up heterogeneous connections using GMPLS control-plane protocols.

IV.B.1. Data-plane aspect

At a switch, whether circuit or packet, arriving data streams are demultiplexed in input line cards, and data from each demultiplexed stream is forwarded to an appropriate output line card through a space fabric whose lines have been appropriately cross-connected. At the output line card, the stream is multiplexed along with all the other demultiplexed streams forwarded to the same output line card before transmission. The protocol used for multiplexing is the same on the input and output interfaces. While there may be a modification of the value of the identifier used to identify a particular connection within the multiplexed data stream on the input and output interfaces, there is no stripping out of protocol-layer headers corresponding to the multiplexing protocol.
For example, at an MPLS switch, the MPLS label value that identifies a virtual circuit on the incoming interface can be different from the MPLS label value used to identify the same virtual circuit on the outgoing interface. Similarly, data arriving on timeslot STS-20 of an OC192 SONET interface may be transmitted out at the STS-21 position of an outgoing OC192 interface.

In contrast, at a gateway, additional circuitry is required to forward data arriving on an interface that uses one type of multiplexing to an outgoing interface that uses a different type of multiplexing. There are two possible solutions in the data plane:
1. Terminate the connections on the input and output interfaces at the gateway, extract the payload being carried in the incoming connection, and send it on to the outgoing connection. We refer to this solution as protocol-converted connections, where the protocol being converted is the multiplexing protocol.
2. Encapsulate data framed according to the multiplexing protocol format on the input interface onto an outgoing connection. We refer to this solution as protocol-encapsulated connections, where the protocol being encapsulated is the multiplexing protocol on the input interface.

One example of a CO gateway is a Cisco or Juniper IP router/MPLS switch. We show the architecture of such a gateway in Figure 2. An input Ethernet/VLAN line card has the capability to demultiplex the Ethernet frames and extract the VLAN labels carried in the Ethernet header. Ethernet frames belonging to the same VLAN are encapsulated as the payload of an MPLS tunnel by adding the same MPLS label if that line card is programmed to support Ethernet over MPLS in VLAN mode [17]. This is illustrated on the top input line card in Figure 2. On the other hand, if a line card is programmed to operate in Ethernet over MPLS in port mode [17], then effectively the multiplexing scheme on that line card is space-division multiplexing.
We illustrate this case in the bottom input line card in Figure 2, where, irrespective of their VLAN labels, all frames are encapsulated into MPLS packets with the same MPLS label value in the header. At the output line card, the packets with new MPLS labels from the input side are then multiplexed along with MPLS-encapsulated packets from other input line cards that are forwarded to the same output line card.

Reference [11] proposes a parameter called the Interface Adaptation Capability Descriptor (IACD) to describe the encapsulating functionality available on each interface. For example, Figure 3 shows the IACD for the top input interface in Figure 2. The first byte of the IACD represents the lower-level multiplexing capability, which is shown to be Packet Switching Capability (PSC), a notation used to represent MPLS multiplexing, and the third byte of the IACD represents the upper-level multiplexing capability, which is shown to be Layer-2 Switching Capability (L2SC), a notation used to represent VLAN multiplexing. The third byte in the IACD for the bottom input line card will be Fiber Switching Capability (FSC) instead of L2SC, since it is set to operate in Ethernet over MPLS in port mode.
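The essence of the IACD is a pair of switching capabilities per interface. The sketch below models it as a simple (lower, upper) pair for the two line cards of Figure 2; this is a simplification of the actual TLV encoding in [11], and the function name is ours:

```python
# GMPLS switching-capability codes used in the text (notation from [11]).
PSC = "PSC"    # Packet Switching Capability (MPLS multiplexing)
L2SC = "L2SC"  # Layer-2 Switching Capability (VLAN multiplexing)
TDM = "TDM"    # Time-Division Multiplexing capability (SONET/SDH)
FSC = "FSC"    # Fiber Switching Capability (port/space-division)

def iacd(lower, upper):
    """An Interface Adaptation Capability Descriptor, reduced to the
    lower-level and upper-level multiplexing capabilities it advertises."""
    return {"lower": lower, "upper": upper}

# Top input line card of Figure 2: Ethernet over MPLS in VLAN mode
top_card = iacd(lower=PSC, upper=L2SC)
# Bottom input line card of Figure 2: Ethernet over MPLS in port mode
bottom_card = iacd(lower=PSC, upper=FSC)
```

An interface advertising such a descriptor is announcing that it can encapsulate (or terminate) upper-layer traffic over a lower-layer connection, which is the information gateways need for the setup procedures below.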

Figure 2. Architecture of an L2SC-PSC gateway (VLAN and port-mapped input line cards, switch controller, switching fabric, and output line card/multiplexer).

Figure 3. IACD format for the interface with VLAN-MPLS adaptation capability (first byte: PSC/Packet; third byte: L2SC/VLAN Ethernet).

This new parameter is sufficient to capture the functionality needed to support protocol-encapsulated connections. It can also be used to support protocol-converted connections. Consider the example in Figure 3. The IACD indicates that this interface has the capability to terminate the lower-layer MPLS connection, which means this particular interface can extract the VLAN payload from incoming MPLS connections. If the IACDs of an input interface and an output interface have the same higher-layer multiplexing capability but different lower-layer multiplexing capabilities, then a protocol-converted connection can be routed across these two interfaces.

IV.B.2. Control-plane aspect

Here we consider the question of how RSVP-TE Path message parameters should be set for a heterogeneous connection. Start with an end host, I, initiating a request for bandwidth to a destination host, D. Given today's networks, end host I can request one of three types of connections: a space-division multiplexed connection in which its whole Ethernet NIC is dedicated to its communication with D, a VLAN-multiplexed connection, or an Intserv IP multiplexed connection. It has, however, no information on the types of connections supported by host D. Therefore it requests whichever of the three types of connections suits its need and sets the destination address of the connection to D.
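Such a request might be represented as follows. This is a hedged sketch, not actual RSVP-TE object syntax: the field names and the `connection_request` helper are invented for illustration, and only the three multiplexing types named in the text are admitted:

```python
# The three connection types an end host can request today (from the text).
SDM = "SDM"         # whole Ethernet NIC dedicated (space-division)
VLAN = "VLAN"       # VLAN-multiplexed (L2SC)
INTSERV_IP = "IP"   # Intserv IP multiplexed

def connection_request(src, dst, mux_type, bandwidth_mbps):
    """Build an illustrative end-host connection request.

    The host names only the destination; it has no knowledge of the
    multiplexing types the destination or intermediate networks support.
    """
    if mux_type not in (SDM, VLAN, INTSERV_IP):
        raise ValueError("unsupported multiplexing type")
    return {"src": src, "dst": dst, "mux": mux_type, "bw": bandwidth_mbps}

# Host I requests a 1 Gbps SDM (Ethernet) connection to host D
req = connection_request("I", "D", SDM, 1000)
```

Note that the destination is D itself, not the first gateway; as discussed later, this is what tells a gateway to encapsulate rather than terminate the connection.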
If host I is connected to a switch, then presumably the type of connection requested (determined by the multiplexing type) matches the multiplexing type supported by the switch on all its interfaces (given our definition of the term "switch"). This switch then progresses the intra-network connection setup procedure along the lines described in Section II. On the other hand, if host I is connected to a gateway, but the connection needs to be routed along an interface that uses the same type of multiplexing as the interface from host I, then again, intra-network connection setup procedures are followed. But if the connection needs to be routed along an interface that uses a different type of multiplexing scheme, then some gateway functions have to be executed. These functions consist of the gateway determining the far end of the new type of connection that needs to be set up for the different multiplexing scheme.

We assume that the I-to-D connection is an intra-area connection. This allows us to describe a procedure in which the gateway can determine the multiplexing capabilities and adaptation capabilities of interfaces at all switches within the area (due to lack of space, we do not go into the details of how OSPF-TE supports the dissemination of such data; the reader is referred to [7] and [11]).

We use an example to illustrate our concept for heterogeneous connection setup. Figure 4 shows an example of a protocol-encapsulated heterogeneous connection setup procedure passing through an MPLS network and a SONET network. Ethernet links connect hosts to gateways, and gateways of different types to each other (GW2 to GW3 in Figure 4). Consider the Abilene network (Internet2 backbone) as an example of the MPLS network and CHEETAH as an example of the SONET network. For clarity, Figure 4 omits the switches within these two networks. An example of an SDM-MPLS gateway (GW1 and GW2 in Figure 4) is a Cisco GSR or Juniper T640.
An example of an SDM-SONET gateway (GW3 and GW4 in Figure 4) is Sycamore's SN16000. Consider a scenario in which an Ethernet end host H1 with no VLAN capability requests an intra-OSPF-area SDM (Ethernet) connection to end host H2. This request is sent from H1 to GW1 (Step 1 in Figure 4). Assume that at GW1, the H1-GW1 interface is programmed to operate in Ethernet over MPLS port mode [17]. Assume GW2 has advertised IACD parameters for its Ethernet interfaces to GW3, indicating SDM as the upper-layer multiplexing capability and PSC as the lower-layer multiplexing capability. Similarly, assume GW3 and GW4

Figure 4. Example of an SDM-MPLS-SDM-TDM-SDM heterogeneous connection. GW1 and GW2 are SDM-MPLS gateways (e.g., Cisco GSR); GW3 and GW4 are SDM-SONET gateways (e.g., Sycamore SN16000). Numbers indicate the order of RSVP-TE signaling messages.

have advertised IACDs for their Ethernet interfaces with SDM for the upper layer and TDM for the lower layer. GW1 processes these IACD parameters from GW2, GW3, and GW4, and recognizes that it needs to set up a protocol-encapsulated SDM-over-PSC connection to GW2. Therefore it initiates a lower-layer MPLS connection setup with the destination set to GW2 (Step 2). When this lower-layer MPLS connection is set up (Step 3), GW1 sends the upper-layer SDM connection setup request to GW2 (Step 4), indicating that the newly set up MPLS connection is to be used for this upper-layer SDM connection. Realizing that GW3 supports SDM multiplexing on its interfaces, GW2 simply sends the SDM connection setup request to GW3 (Step 5). When the SDM connection setup request reaches GW3, it recognizes that it needs to set up a protocol-encapsulated SDM-over-TDM connection to GW4, using information in GW4's IACD parameters. Therefore, it initiates a lower-layer TDM connection setup destined to GW4 (Steps 6 and 7), and then sends the upper-layer SDM connection setup request destined to H2 to GW4 (Step 8), indicating that the newly set up TDM connection is to be used for this upper-layer SDM connection. GW4 recognizes that the SDM connection should be routed to an outgoing SDM interface to the final destination host H2 (Step 9). Resv messages for the SDM connection are sent in the reverse direction (Steps 10-14).
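The ordering of this walkthrough can be sketched as a toy simulation: each path segment is labeled with its native multiplexing, and whenever a segment is not SDM, a lower-layer connection is set up before the upper-layer request is forwarded over it. The segment labels and event strings below are illustrative only:

```python
def setup_events(segments, upper="SDM"):
    """Return the ordered setup events for an upper-layer connection
    crossing segments that may require protocol encapsulation."""
    events = []
    for start, end, mux in segments:
        if mux != upper:
            # The gateway first establishes the lower-layer tunnel...
            events.append(f"setup {mux} lower-layer connection {start}->{end}")
        # ...then forwards the upper-layer request over the segment.
        events.append(f"forward {upper} request {start}->{end}")
    return events

# The path of Figure 4: H1-GW1 (Ethernet), GW1-GW2 (MPLS network),
# GW2-GW3 (direct Ethernet link), GW3-GW4 (SONET network), GW4-H2 (Ethernet)
path = [("H1", "GW1", "SDM"), ("GW1", "GW2", "PSC"),
        ("GW2", "GW3", "SDM"), ("GW3", "GW4", "TDM"),
        ("GW4", "H2", "SDM")]
events = setup_events(path)
```

Running this yields seven forward-direction events whose order mirrors Steps 1-9 of Figure 4: the MPLS tunnel is set up before the SDM request crosses GW1-GW2, and the TDM circuit before it crosses GW3-GW4 (the Resv messages of Steps 10-14 are not modeled).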
In this example, two protocol-encapsulated connection setup procedures were initiated, by GW1 and GW3, to map all Ethernet frames from an incoming SDM connection to an outgoing MPLS connection and SONET connection, respectively. Recall our earlier statement on using the control plane to decide whether a connection is protocol-converted or protocol-encapsulated when circuit-based multiplexing schemes, such as SDM, are involved. Since the Path message in Step 1 carried H2's address as the destination for the SDM connection, and not GW1's address, the SDM connection incoming at GW1 is protocol-encapsulated onto the MPLS connection. A similar explanation holds for the action performed at GW3. A practical implementation of an SDM-SONET-SDM connection across the CHEETAH network using a similar heterogeneous setup procedure is described in [15].

We do not show a protocol-converted connection setup procedure in this example. However, if a gateway product combining GW2 and GW3 functionality were available, it would advertise IACD parameters for its interfaces into the MPLS and SONET networks showing SDM as the upper-layer multiplexing capability, and PSC and TDM, respectively, as the lower-layer multiplexing capabilities. In this case, the gateway could initiate a lower-layer TDM connection setup directly to the remote gateway on the SONET network and map Ethernet frames extracted from the terminating MPLS virtual circuit onto the SONET circuit. This would be an example of a protocol-converted connection.

The above discussion has been for intra-area heterogeneous connections. Less information will be available to upstream gateways about the switching and adaptation capabilities of downstream gateways for inter-area or inter-domain connections. We are currently studying this problem and expect our work to propose further enhancements to GMPLS protocols and procedures.

V. CONCLUSIONS

We draw two sets of conclusions in this paper.
First, we conclude that Connection-Oriented (CO) networks equipped with GMPLS control-plane protocols are well matched to the guaranteed service quality, adaptability and scalability requirements of Grid computing. Second, we note two areas in which enhancements are required to current GMPLS protocols and implementations. On the protocol design side, we discuss the need to extend GMPLS control-plane protocols to support heterogeneous connections, given that heterogeneity is a desired attribute for Grid computing and that multiple GMPLS-based CO networking technologies are currently available. On GMPLS implementations, we demonstrate that without a significant reduction in call processing delays, large-scale networks will be hard to create because of the impact of these delays on utilization.

ACKNOWLEDGEMENTS

We thank Anant Padmanath Mudambi, Tao Li, Murali Krishna Nethi, and Xiangfei Zhu, from the University of Virginia, and Haobo Wang and Ramesh Karri, from Polytechnic University, NY, for their help with this work. This work was carried out under the sponsorship of NSF ITR-0312376, NSF ANI-0335190, NSF ANI-0087487, and DOE DE-FG02-04ER25640 grants.

REFERENCES
[1] M. Baker, R. Buyya, and D. Laforenza, "Grids and Grid Technologies for Wide-Area Distributed Computing," Software - Practice and Experience, John Wiley and Sons, 2002.
[2] I. Foster, "What is the Grid? A Three Point Checklist," http://wwwfp.mcs.anl.gov/~foster/articles/whatisthegrid.pdf, July 20, 2002.
[3] R. Braden, L. Zhang, S. Berson, S. Herzog, and S. Jamin, "Resource ReSerVation Protocol (RSVP) - Version 1 Functional Specification," IETF RFC 2205, September 1997.
[4] E. Rosen, A. Viswanathan, and R. Callon, "Multiprotocol Label Switching Architecture," IETF RFC 3031, January 2001.
[5] IEEE, "802.1Q: Virtual Bridged Local Area Networks," IEEE Standard, May 2003.
[6] J. Lang, "Link Management Protocol (LMP)," IETF RFC 4204, October 2005.
[7] D. Katz, K. Kompella, and D. Yeung, "Traffic Engineering (TE) Extensions to OSPF Version 2," IETF RFC 3630, September 2003.
[8] D. Awduche, L. Berger, D. Gan, T. Li, V. Srinivasan, and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels," IETF RFC 3209, December 2001.
[9] J. Sobieski, T. Lehman, and B. Jabbari, "Dynamic Resource Allocation through GMPLS Optical Networks (DRAGON)," http://dragon.east.isi.edu/.
[10] E. Bell, et al., "Definitions of Managed Objects for Bridges with Traffic Classes, Multicast Filtering and Virtual LAN Extensions," IETF RFC 2674, 1999.
[11] D. Papadimitriou, M. Vigoureux, K. Shiomoto, D. Brungard, and J.-L. Le Roux, "Generalized Multi-Protocol Label Switching (GMPLS) Protocol Extensions for Multi-Region Networks (MRN)," IETF Internet Draft, February 2005.
[12] S-Y. Hwang and R. Riddle, "Bandwidth Reservation for User Work (BRUW)," http://people.internet2.edu/~bdr/talks/meetings/terena2005/bod-bruw.doc, May 23, 2005.
[13] C. Guok, "ESnet On-demand Secure Circuits and Advance Reservation System (OSCARS)," http://www.es.net/oscars/index.html.
[14] X. Zheng, M. Veeraraghavan, N. S. V. Rao, Q. Wu, and M. Zhu, "CHEETAH: Circuit-switched High-speed End-to-End Transport ArcHitecture testbed," IEEE Communications Magazine, vol. 43, no. 8, pp. s11-s17, Aug. 2005.
[15] X. Zhu, X. Zheng, M. Veeraraghavan, Z. Li, Q. Song, I. Habib, and N. S. V. Rao, "Implementation of a GMPLS-based Network with End Host Initiated Signaling," submitted to ICC 2006.
[16] H. Wang, M. Veeraraghavan, R. Karri, and T. Li, "Design of a High-Performance RSVP-TE Signaling Hardware Accelerator," IEEE Journal on Selected Areas in Communications, 2005.
[17] W. Luo, D. Bokotey, A. Chan, and C. Pignataro, Layer 2 VPN Architectures: Understanding Any Transport over MPLS, Cisco Press, May 12, 2005.