Native Ethernet transmission beyond the LAN
Cătălin Meiroşu, CERN and Politehnica Bucureşti
TERENA 05, Poznań

Overview
- 10 GE, a distance-independent Ethernet
- The WAN PHY: long-haul Ethernet and SONET-friendly
  - theory and results
- Survey of Ethernet over transport, compared to WAN PHY
  - or suits vs. t-shirts, round #2
- LAN everywhere?

Distance-independent Ethernet
- The original Ethernet (original drawing by Bob Metcalfe)
  - Shared copper medium
  - Distance-limited because of CSMA/CD
  - Limited speed
- IEEE 802.3ae-2002
  - No CSMA/CD
  - Full duplex, point to point
  - Switched architecture
  - 10 Gbit/s
  - 40 km over fibre
  - WAN access

LAN PHY over dark fibre
- The 10 GE LAN PHY includes a 1550 nm laser
  - That means optical amplifiers can, in principle, be used to increase the point-to-point span
  - If you happen to own dark fibre or find a compliant carrier
- Practical results: 550 km using XENPAK optics, ESTA project
  - See paper by Olesen et al., COIN 2004
[Figure: testbed between Lyngby and Aalborg. At each end, server PCs connect at 1 GE to a BATM T6 switch with a 10GE LAN PHY port. The path between the switches uses EDFA amplifiers and DCF, with fibre spans labelled 121, 93, 38, 62, 84, 75 and 52 km.]

The 10 GE WAN PHY 1/2
- What happens when the distance is greater than ~600 km?
  - The signal has to be regenerated, but the 3R regenerators already deployed require SONET framing
- 10GE introduces a gateway from the LAN to the WAN by means of the WAN PHY
  - Compatible with existing regenerators
  - Transmission rate: SONET OC192c / SDH STM64c
  - Encapsulation: STS-192c / VC-4-64c
  - Partial use of the management bits of the SONET/SDH frame
- Direct connection to DWDM and SONET/SDH is possible, as demonstrated by the ESTA project
[Figure: two ways of crossing a WAN built from 3R regenerators. Traditional: Router, OC192, LTE at each end. Novel: 10GE switch/router, WAN PHY, ELTE at each end.]

The 10 GE WAN PHY 2/2
- The 802.3ae standard says:
  - Optical jitter shall be measured according to IEEE specifications
    - ANSI and ITU-T use a different method for SONET/SDH optical jitter
  - Transmit clock accuracy shall be as good as the lowest limit acceptable from the long-haul infrastructure point of view, that is 20 ppm (the degraded-service SONET clock)
    - For brand-new oscillators, the ESTA project measured 0.1 ppm between a WAN PHY clock and a STRATUM-3 SONET clock
    - Cheaper crystals are expected to drift more with age
  - SONET/SDH line timing is not supported
- What did equipment manufacturers do?
  - They use SONET optics and oscillators
  - Some implemented line timing
  - Warning: the status quo on the market today does not mean it will continue in the future!
- ITU-T recommendation G.8012: the WAN PHY can be deployed as NNI or UNI when its clock accuracy is +/- 4.6 ppm or better (i.e. at least STRATUM-3 equivalent)

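To put the ppm figures above into absolute terms, the sketch below converts a clock accuracy into a frequency offset at the OC-192 line rate and checks it against the two limits quoted on the slide. The helper names are hypothetical, and the 9.95328 Gbit/s line rate is the standard OC-192/STM-64 value rather than a number taken from the slides.

```python
# Hypothetical helpers; only the 20 ppm and 4.6 ppm limits come from the slide.
OC192_LINE_RATE_HZ = 9_953_280_000  # standard OC-192 / STM-64 line rate (assumption, not on the slide)

WAN_PHY_LIMIT_PPM = 20.0  # transmit clock accuracy limit quoted for 802.3ae
G8012_LIMIT_PPM = 4.6     # accuracy quoted for NNI/UNI deployment in G.8012


def ppm_to_hz(ppm: float, nominal_hz: float = OC192_LINE_RATE_HZ) -> float:
    """Convert a relative accuracy in ppm into an absolute offset in Hz."""
    return nominal_hz * ppm / 1e6


def classify_clock(offset_ppm: float) -> str:
    """Compare a measured clock offset with the two limits from the slide."""
    magnitude = abs(offset_ppm)
    if magnitude <= G8012_LIMIT_PPM:
        return "ok for NNI/UNI (G.8012)"
    if magnitude <= WAN_PHY_LIMIT_PPM:
        return "within the 802.3ae WAN PHY limit only"
    return "out of tolerance"


if __name__ == "__main__":
    for ppm in (0.1, 4.6, 20.0):
        print(f"{ppm:5.1f} ppm = {ppm_to_hz(ppm):12.1f} Hz -> {classify_clock(ppm)}")
```

Under these assumptions the 0.1 ppm offset measured by ESTA sits well inside the G.8012 limit, while a crystal that has drifted to 10 ppm would still satisfy 802.3ae but not G.8012.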
WAN PHY testbed Geneva - Tokyo
[Figure: testbed of October-November 2004. At each end (Tokyo and Geneva) a server PC connects over 10GE to a Foundry NI40G, whose 10 GE WAN PHY port feeds the transport equipment (ONS 15454 and OME 6500). The OC192 circuit runs via Seattle, Chicago and Amsterdam over the IEAAF/WIDE, PNW, CANARIE and SURFnet networks.]

Results: single-stream TCP
- Iperf, 8192-byte jumbo frames, 512 MB TCP window size
- RTT 262 ms
[Figure: Opteron1 (Chelsio T110) in Tokyo and Opteron2 (Chelsio T110) in Geneva, each attached to a 10GE LAN port of a NetIron40G. The two NetIron40G switches are connected through their WAN PHY ports.]

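The window size used in this test can be sanity-checked against the bandwidth-delay product of the path. The sketch below is only an illustration: the function names are hypothetical, "512 MB" is assumed to mean 512 MiB, and the WAN PHY data rate is derived from the STS-192c payload rate and the 64b/66b coding overhead rather than taken from the slides.

```python
# Hypothetical helpers for the bandwidth-delay product (BDP) of the Geneva - Tokyo path.
RTT_S = 0.262                        # round-trip time from the slide
WINDOW_BYTES = 512 * 2**20           # assumption: "512 MB" means 512 MiB

STS192C_PAYLOAD_BPS = 9_584_640_000  # STS-192c / VC-4-64c payload rate
# Assumption: the usable WAN PHY data rate is the payload rate reduced by 64b/66b coding.
WAN_PHY_DATA_RATE_BPS = STS192C_PAYLOAD_BPS * 64 / 66


def bdp_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bytes in flight needed to keep a path of this rate and RTT full."""
    return rate_bps * rtt_s / 8


def window_limited_rate_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on single-stream TCP throughput for a given window."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    bdp = bdp_bytes(WAN_PHY_DATA_RATE_BPS, RTT_S)
    print(f"BDP at WAN PHY rate: {bdp / 1e6:.1f} MB")
    print(f"Window-limited rate: {window_limited_rate_bps(WINDOW_BYTES, RTT_S) / 1e9:.2f} Gbit/s")
```

Under these assumptions the bandwidth-delay product is about 304 MB, so a 512 MB window does not by itself limit a single stream on this path.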
Buffering LAN PHY to WAN PHY
- Memory is required to adapt continuous traffic bursts
- Even in a single-stream environment, TCP may send bursts up to 1 RTT in length
[Figure: TCP bandwidth plot.]

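A rough estimate of the memory involved: if a burst arrives at the LAN PHY rate and drains at the slower WAN PHY rate, the buffer must hold the difference for the duration of the burst. The sketch below assumes a 10 Gbit/s input, a WAN PHY data rate of the STS-192c payload reduced by 64b/66b coding, and a burst one RTT long. The function name and the rate figures are assumptions, not values from the slides.

```python
# Hypothetical estimate of the buffer needed when a burst arrives faster than it can leave.
LAN_PHY_RATE_BPS = 10_000_000_000              # 10 GE LAN PHY data rate
WAN_PHY_RATE_BPS = 9_584_640_000 * 64 / 66     # assumption: STS-192c payload after 64b/66b coding


def burst_buffer_bytes(in_rate_bps: float, out_rate_bps: float, burst_s: float) -> float:
    """Bytes that accumulate while a burst of length burst_s arrives at in_rate and drains at out_rate."""
    excess_bps = max(0.0, in_rate_bps - out_rate_bps)  # nothing accumulates if the output is faster
    return excess_bps * burst_s / 8


if __name__ == "__main__":
    rtt_s = 0.262  # RTT of the Geneva - Tokyo testbed
    needed = burst_buffer_bytes(LAN_PHY_RATE_BPS, WAN_PHY_RATE_BPS, rtt_s)
    print(f"Buffer for a one-RTT burst: {needed / 1e6:.1f} MB")
```

With these figures a full-rate burst lasting one 262 ms RTT would need about 23 MB of buffering. A real switch may face shorter bursts or apply flow control, so this is an upper-end illustration.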
The optical module format (jungle?)
- Optical ports in Gigabit Ethernet switches may use modular optics
  - The 10 Gigabit Ethernet equipment followed the tradition
- XENPAK
  - First generation of modular optics
  - Implements the PCS/PMA/PMD from the Ethernet standard
  - 1310, 1550 and 850 nm optics
  - Quite a big and power-hungry module, SC connector for the fibre
- XPAK and X2 modules improve on the form factor and power consumption, but support only 850 nm and 1310 nm lasers (to date)
- XFP
  - Third generation of modular optics
  - Common specifications for OC192, 10 G Fibre Channel, G.709 and 10 GE
  - Only implements the PMD in 10GE
    - Requires a SERDES chip on the line card of the switch
  - LC fibre connector
  - Targeted at 850 and 1310 nm optics
  - Opens the possibility of having software-configurable LAN/WAN PHY switch ports
[Figure: the 802.3ae LAN model, top to bottom: MAC (Media Access Control), Reconciliation, XGMII, 10GBASE-R PCS (Physical Coding Sublayer), WIS (WAN Interface Sublayer), PMA (Physical Medium Attachment), PMD (Physical Medium Dependent), MDI, Medium. The WIS only exists in 10GBASE-W. The other sublayers are specified by both 10GBASE-R and 10GBASE-W.]

Ethernet topology long haul
- Ethernet transmits at 10 Gbit/s and has a direct WAN interface
- But it still has limitations for long-distance deployments (a few examples below)
  - VLAN scalability
  - MAC address table explosion
- Ongoing work to address limitations in the IEEE 802.1 working group
  - Q-in-Q encapsulation addresses the VLAN scalability issue
  - MAC-in-MAC encapsulation addresses the MAC address table explosion problem
- More work in ITU-T SG13 and MEF to define services and interfaces
- Use of Link Aggregation (IEEE 802.3ad)?
  - Bundles either LAN PHY or WAN PHY links to provide redundancy or just more bandwidth
  - Maximum number of aggregated links still depends on the manufacturer; usually 4 to 16, i.e. 40 to 160 Gbit/s
  - Automatic load balancing or hot standby, with automatic traffic rerouting
  - Time for rerouting: depends on how fast the MAC layer figures out that the link is down (worst case: RTT)

SDH framing overview
- Structure of an STM-N frame
[Figure: structure of an STM-N frame, with the path overhead marked.]
- Two methods to organise the Virtual Container for data-aware services (examples for an STM-64 frame, ITU-T recommendation G.707, 12/2003)
  - Contiguous concatenation: VC-4-64c, for a user payload of 9.584 Gbit/s
  - Virtual concatenation: VC-4-Xv, where X = 1..64
    - Sum of X individual VC-4 containers, each containing 261 columns (1 column of path overhead and 260 columns of user payload, hence a data rate of 149.760 Mbit/s)
    - Each VC-4 that is part of a VC-4-Xv may be transmitted over a different path

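The payload rates quoted above follow directly from the frame geometry. The sketch below reproduces them, assuming the standard SDH values of 9 rows per frame and 8000 frames per second, which are not stated on the slide. The function names are hypothetical.

```python
# Hypothetical helpers reproducing the VC-4 payload rates from the frame geometry.
FRAMES_PER_SECOND = 8000     # SDH frame rate (assumption, not stated on the slide)
ROWS_PER_FRAME = 9           # rows in an SDH frame (assumption, not stated on the slide)
VC4_PAYLOAD_COLUMNS = 260    # user payload columns per VC-4, from the slide


def vc4_payload_bps() -> int:
    """User payload rate of a single VC-4 in bit/s."""
    bytes_per_frame = VC4_PAYLOAD_COLUMNS * ROWS_PER_FRAME
    return bytes_per_frame * 8 * FRAMES_PER_SECOND


def vcat_payload_bps(x: int) -> int:
    """User payload rate of a virtually concatenated VC-4-Xv group."""
    if x < 1:
        raise ValueError("X must be at least 1")
    return x * vc4_payload_bps()


if __name__ == "__main__":
    print(vc4_payload_bps())      # single VC-4
    print(vcat_payload_bps(7))    # VC-4-7v
    print(vcat_payload_bps(64))   # same payload rate as VC-4-64c
```

The single VC-4 value corresponds to the 149.760 Mbit/s on the slide, and 64 of them give the 9.584 Gbit/s quoted for VC-4-64c.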
Ethernet over SDH
- Generic Framing Procedure (ITU-T G.7041)
  - Method for encapsulating variable-length payload of client signals for transfer over SDH and OTN
    - Minimum 4 bytes overhead
    - Maximum client payload 65531 bytes
  - Defines Ethernet over GFP
    - Frame-mapped mode
      - Preamble and start-of-frame bits removed from the Ethernet frame
      - Keeps the Ethernet FCS (CRC) field
      - The IPG is removed
        - But SONET/SDH is synchronous, so what happens during the IPG?
    - Transparent mode (for Gigabit Ethernet over fibre signals)
- Link Capacity Adjustment (ITU-T G.7042)
  - Defines a protocol for dynamically adjusting the capacity of a container

Throughput of GE over GFP
Frame size   VC-4-6v (898560 kbit/s)   VC-4-7v (1048320 kbit/s)
64 B         98.8 %                    100 %
1024 B       90.5 %                    100 %
1518 B       90.3 %                    100 %

Throughput of 10GE over GFP
Frame size   VC-4-64v (9584640 kbit/s)   VC-4-66v (9884160 kbit/s)
64 B         100 %                       100 %
1024 B       97 %                        98.9 %
1518 B       96.7 %                      98.9 %

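An idealised model of frame-mapped GFP shows where throughput figures of this kind come from: the preamble and IPG (20 bytes per frame on the Ethernet side) are dropped, GFP headers are added, and the result has to fit in the container. The sketch below assumes 8 bytes of GFP overhead per frame (4-byte core header plus 4-byte payload header, no extension header or payload FCS). That overhead value and the function name are assumptions, and the model is not guaranteed to reproduce every figure in the tables above.

```python
# Hypothetical, idealised model of frame-mapped Ethernet over GFP.
ETH_PREAMBLE_SFD_BYTES = 8   # preamble + start-of-frame delimiter, removed by GFP-F
ETH_IPG_BYTES = 12           # minimum inter-packet gap, removed by GFP-F
GFP_OVERHEAD_BYTES = 8       # assumption: core header (4) + payload header (4)


def gfp_throughput_fraction(frame_bytes: int, eth_rate_bps: float, container_bps: float,
                            gfp_overhead: int = GFP_OVERHEAD_BYTES) -> float:
    """Fraction of Ethernet line-rate traffic of one frame size that fits in the container."""
    wire_bytes = frame_bytes + ETH_PREAMBLE_SFD_BYTES + ETH_IPG_BYTES  # bytes per frame on Ethernet
    gfp_bytes = frame_bytes + gfp_overhead                             # bytes per frame inside GFP
    needed_bps = eth_rate_bps * gfp_bytes / wire_bytes                 # container rate needed at line rate
    return min(1.0, container_bps / needed_bps)


if __name__ == "__main__":
    vc4_64v = 9_584_640_000
    for size in (64, 1024, 1518):
        fraction = gfp_throughput_fraction(size, 10e9, vc4_64v)
        print(f"{size:5d} B: {fraction * 100:.1f} %")
```

Small frames benefit most because the 20 bytes removed are a larger share of each frame, which is why 64-byte traffic fits even when the container is slower than the Ethernet line rate.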
The suits vs. the t-shirts
- First round took place in 1996-1998: ATM vs. Fast/Gigabit Ethernet in the LAN
  - We all know who won
- But the t-shirts are entering the suits' own field now
  - Gigabit and 10 Gigabit Ethernet began spreading into the MAN
    - Because of dark fibre?
  - 10 GE is WAN-ready
- Examples of limitations to be addressed before Ethernet becomes a viable alternative topology on the WAN
  - Lack of carrier-grade OAM features
    - IEEE 802.3ah-2004 EFM introduced a subset
  - Network topology: spanning tree, broadcast storms
- Ethernet over GFP
  - Long-distance transport technology, turned Ethernet-friendly
  - While still providing carrier-grade
    - OAM
    - circuit protection
    - bandwidth guarantees

LAN everywhere?
- Ethernet everywhere means
  - A 100000-node flat Ethernet network?
  - Or adopting the same strategy as within the enterprise today: partition the network into IP subnets
    - Routers have Ethernet ports!
    - No more broadcast storms
- Ethernet as a pure link layer is distance-unlimited and friendly to the installed base of long-haul infrastructure
- Higher-layer Ethernet features may be used
  - User authentication based on 802.1x
  - The 802.3ad link aggregation

Conclusion
- The ESTA EU project pioneered 10 GE demos over long-distance networks: dark fibre, DWDM and SONET/SDH
- Native Ethernet can be used on the long haul to
  - complement routed networks
    - layer 2 topology (if it scales to the requirements)
  - provide the link layer for an IP network
- Suits vs. t-shirts round #2 has just started
  - The suits' citadel is under attack

Acknowledgements
- The ESTA project (IST-2001-33182): Alexis Lestra, Brian Martin, Bob Dobinson
- The carriers: DARENET, SURFnet, CANARIE, PNW, WIDE
- LAN equipment: Foundry Networks
- Prof. Kei Hiraki, University of Tokyo
- Wade Hong, Carleton University, Ottawa