L19: Data Center Network Architectures, by T.S.R.K. Prasad, EA C451 Internetworking Technologies, 27/09/2012
References / Acknowledgements
[Feamster-DC] Nick Feamster, Data Center Networking, CS6250: Computer Networking, Fall 2011.
[Al-Fares] Mohammad Al-Fares, Alexander Loukissas and Amin Vahdat, A Scalable, Commodity Data Center Network Architecture, SIGCOMM 2008.
[Hamilton] James Hamilton, An Architecture for Modular Data Centers, CIDR 2007.
[Benson] Theophilus Benson, Aditya Akella and David A. Maltz, Network Traffic Characteristics of Data Centers in the Wild, IMC 2010.
References
Optional Readings
[Gill] Phillipa Gill, Navendu Jain and Nachiappan Nagappan, Understanding Network Failures in Data Centers: Measurement, Analysis, and Implications, SIGCOMM 2011.
[Cisco-DCI] Cisco Data Center Infrastructure 2.5 Design Guide, 2007.
[Greenberg] Albert Greenberg, James Hamilton, David A. Maltz and Parveen Patel, The Cost of a Cloud: Research Problems in Data Center Networks.
Optional Reading
Self-Study
[Greenberg2] Albert Greenberg, Parantap Lahiri, David A. Maltz, Parveen Patel and Sudipta Sengupta, Towards a Next Generation Data Center Architecture: Scalability and Commoditization, PRESTO 2008.
Self-Study
Presentation Overview: Motivation, Basic Architecture, Architectural Improvements, Traffic Characteristics. Lecture Outline
Cloud Computing: Elastic resources (expand and contract resources, pay-per-use, infrastructure on demand); Multi-tenancy (multiple independent users, security and resource isolation, amortize the cost of the shared infrastructure); Flexible service management (resiliency: isolate failures of servers and storage; workload movement: move work to other locations). Motivation Cloud Computing
Trend of Data Centers: In the cloud computing era, data centers will grow larger and larger to benefit from economies of scale, moving from tens of thousands to hundreds of thousands of servers (e.g., Google's roughly 200 million Euro mega data center in Finland; see http://gigaom.com/cloud/google-builds-megadata-center-in-finland/ and http://www.datacenterknowledge.com; J. Nicholas Hoover, InformationWeek, June 17, 2008). The most important concerns in data center management are economies of scale, high utilization of equipment, maximizing revenue, amortizing administration cost, and low power consumption (http://www.energystar.gov/ia/partners/prod_development/downloads/epa_datacenter_report_congress_final1.pdf). Motivation Trend of Data Center
Cloud Service Models: Software as a Service (SaaS): the provider licenses applications to users as a service (e.g., customer relationship management, email), avoiding the costs of installation, maintenance, and patches. Platform as a Service (PaaS): the provider offers a software platform for building applications (e.g., Google's App Engine), avoiding worries about the scalability of the platform. Infrastructure as a Service (IaaS): the provider offers raw computing, storage, and network (e.g., Amazon's Elastic Compute Cloud, EC2), avoiding buying servers and estimating resource needs. Motivation Cloud Service Models
Multi-Tier Applications: Applications consist of tasks, with many separate components running on different machines. Commodity computers: many general-purpose computers rather than one big mainframe, which makes scaling easier. [Figure: a front-end server fans requests out to aggregators, which in turn fan out to many workers.] Motivation Multi-Tier Applications
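To make the fan-out concrete, here is a minimal sketch, not from the lecture, of the partition/aggregate pattern this slide describes: a front end spreads a request across aggregators, and each aggregator spreads it across workers and merges the partial results. All function names, counts, and data are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def worker(shard_id, query):
    # Stand-in for a worker answering the query from its local data shard.
    return [f"shard{shard_id}: {query}"]

def aggregator(shard_ids, query):
    # Each aggregator queries its workers in parallel and merges the partial results.
    with ThreadPoolExecutor(max_workers=len(shard_ids)) as pool:
        partials = list(pool.map(lambda s: worker(s, query), shard_ids))
    return [item for part in partials for item in part]

def front_end(query, num_aggregators=4, workers_per_agg=4):
    # The front end fans the request out to aggregators and concatenates their answers.
    groups = [list(range(a * workers_per_agg, (a + 1) * workers_per_agg))
              for a in range(num_aggregators)]
    with ThreadPoolExecutor(max_workers=num_aggregators) as pool:
        answers = list(pool.map(lambda g: aggregator(g, query), groups))
    return [item for ans in answers for item in ans]

if __name__ == "__main__":
    print(front_end("example query")[:4])
```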
Enabling Technology: Virtualization Multiple virtual machines on one physical machine Applications run unmodified as on real machine VM can migrate from one computer to another Motivation Virtualization
Status Quo: Virtual Switch in Server Motivation vswitch
Presentation Overview: Motivation, Basic Architecture, Architectural Improvements, Traffic Characteristics. Lecture Outline
Common Data Center Topology: [Figure: the Internet at the top; a Core layer of layer-3 routers; an Aggregation layer of layer-2/3 switches; an Access layer of layer-2 switches; servers at the bottom.] Basic Architecture Common DC Topology
Top-of-Rack (ToR) Architecture: A rack of commodity servers plus a top-of-rack switch. Modular design: preconfigured racks with power, network, and storage cabling, aggregated to the next level. Basic Architecture ToR Architecture
Modularity Containers Many containers Basic Architecture Modularity
Data Center Network Topology Basic Architecture DC Network Topology
Problems with Common Topologies: Single point of failure; oversubscription of links higher up in the topology; tradeoff between cost and provisioning. Basic Architecture Problems with Common Topology
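A rough sketch of how the oversubscription mentioned above can be quantified; the port counts and link speeds below are assumed example values, not numbers from the lecture.

```python
def oversubscription(downlink_count, downlink_gbps, uplink_count, uplink_gbps):
    """Ratio of capacity facing the servers to capacity facing the layer above."""
    return (downlink_count * downlink_gbps) / (uplink_count * uplink_gbps)

# Example: a ToR switch with 48 x 1 GbE server ports and 2 x 10 GbE uplinks
# is 2.4:1 oversubscribed; if the aggregation layer adds another assumed 2:1,
# the end-to-end ratio multiplies.
tor_ratio = oversubscription(48, 1, 2, 10)   # 2.4
agg_ratio = 2.0                              # assumed aggregation-layer ratio
print(f"ToR: {tor_ratio}:1, end-to-end: {tor_ratio * agg_ratio}:1")
```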
Requirements for Future Data Centers: To keep up with the trend toward mega data centers, DCN technology should meet the following requirements: high scalability; transparent VM migration (high agility); easy deployment requiring little human administration; efficient communication; loop-free forwarding; fault tolerance. Current DCN technology cannot meet these requirements: layer-3 protocols cannot support transparent VM migration, and current layer-2 protocols are not scalable due to forwarding-table size and native broadcasting for address resolution. Basic Architecture Future Requirements
Capacity Mismatch Basic Architecture Capacity Mismatch
Data-Center Routing Basic Architecture DC Routing
Reminder: Layer 2 vs. Layer 3. Ethernet switching (layer 2): cheaper switch equipment, fixed addresses and auto-configuration, seamless mobility, migration, and failover. IP routing (layer 3): scalability through hierarchical addressing, efficiency through shortest-path routing, multipath routing through equal-cost multipath (ECMP). So, as in enterprises, data centers often connect layer-2 islands by IP routers. Basic Architecture L2 vs L3
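A hedged sketch of the equal-cost multipath (ECMP) idea mentioned above: hash a flow's 5-tuple so that all packets of one flow follow the same path while different flows spread across the equal-cost paths. The field names, hash choice, and next-hop names are illustrative assumptions, not a specific router's implementation.

```python
import hashlib

def ecmp_next_hop(src_ip, dst_ip, src_port, dst_port, proto, next_hops):
    """Deterministically map a flow's 5-tuple to one of the equal-cost next hops."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return next_hops[digest % len(next_hops)]

uplinks = ["agg1", "agg2", "agg3", "agg4"]   # hypothetical equal-cost uplinks
print(ecmp_next_hop("10.0.1.5", "10.0.9.7", 44321, 80, "tcp", uplinks))
```

Because the hash is computed per flow rather than per packet, a flow's packets stay in order on one path, which is the usual motivation for hashing the 5-tuple instead of spraying packets.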
Need for Layer 2: Certain monitoring applications require servers with the same role to be on the same VLAN; the same IP can be used on dual-homed servers; it allows organic growth of server farms; migration is easier. Basic Architecture Need for L2
Review of Layer 2 & Layer 3: Layer 2: one spanning tree for the entire network prevents loops but ignores alternate paths. Layer 3: shortest-path routing between source and destination; best-effort delivery. Basic Architecture Review of L2 vs L3
Data Center Challenges Traffic load balance Support for VM migration Achieving bisection bandwidth Power savings / Cooling Network management (provisioning) Security (dealing with multiple tenants) Basic Architecture DC Challenges
Data Center Costs (Monthly Costs): Servers: 45% (CPU, memory, disk). Infrastructure: 25% (UPS, cooling, power distribution). Power draw: 15% (electrical utility costs). Network: 15% (switches, links, transit). http://perspectives.mvdirona.com/2008/11/28/costofpowerinlargescaledatacenters.aspx Basic Architecture DC Costs
Presentation Overview: Motivation, Basic Architecture, Architectural Improvements, Traffic Characteristics. Lecture Outline
FAT Tree-Based Solution: Connect end hosts together using a fat-tree topology. The infrastructure consists of cheap devices, and each port supports the same speed as an end host. All devices can transmit at line speed if packets are distributed along existing paths. A k-port fat tree can support k^3/4 hosts. (Naming controversy.) Architectural Improvements FAT Tree-Based Solution
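The k^3/4 figure follows from the fat-tree construction in [Al-Fares]: k pods, each with k/2 edge and k/2 aggregation switches, plus (k/2)^2 core switches. A small sketch of that arithmetic; the example value k = 48 matches the commodity-switch scenario in the paper.

```python
def fat_tree_size(k):
    """Switch and host counts for a k-ary fat tree built from k-port switches."""
    assert k % 2 == 0, "k must be even"
    edge = agg = k * (k // 2)          # k pods x k/2 switches each
    core = (k // 2) ** 2               # (k/2)^2 core switches
    hosts = (k ** 3) // 4              # k pods x (k/2 edge) x (k/2 hosts per edge)
    return {"pods": k, "edge": edge, "agg": agg, "core": core,
            "switches": edge + agg + core, "hosts": hosts}

# 48-port switches: 27,648 hosts from 2,880 identical commodity switches.
print(fat_tree_size(48))
```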
Fat-Tree Topology Architectural Improvements Fat-Tree Topology
DC Architecture with Load Balancers (LB) Architectural Improvements DC Architecture with LBs
Monsoon Architecture Architectural Improvements Monsoon Architecture
Monsoon Architecture Aggregation and access layers form a full mesh topology Architectural Improvements Monsoon Architecture
Presentation Overview: Motivation, Basic Architecture, Architectural Improvements, Traffic Characteristics. Lecture Outline
The Importance of Speed A 1-millisecond advantage in trading applications can be worth $100 million a year to a major brokerage firm Traffic Characteristics
Dataset: Data Centers Studied. 10 data centers in 3 classes: universities, private enterprise, and clouds. University and private data centers serve internal users, are small, and are local to a campus; clouds serve external users, are large, and are globally diverse.

DC Role              DC Name  Location    Number of Devices
Universities         EDU1     US-Mid       22
                     EDU2     US-Mid       36
                     EDU3     US-Mid       11
Private Enterprise   PRV1     US-Mid       97
                     PRV2     US-West     100
Commercial Clouds    CLD1     US-West     562
                     CLD2     US-West     763
                     CLD3     US-East     612
                     CLD4     S. America  427
                     CLD5     S. America  427

Traffic Characteristics Dataset
Canonical Data Center Architecture: [Figure: Core (L3), Aggregation (L2), and Edge (L2, top-of-rack) layers above the application servers; SNMP and topology data are captured from all links, and packet sniffers are attached at top-of-rack switches.] Traffic Characteristics Capture Points
Traffic by Applications: [Stacked-bar chart of the traffic mix (AFS, NCP, SMB, LDAP, HTTPS, HTTP, OTHER) for PRV2_1 through PRV2_4 and EDU1 through EDU3.] Differences between the various bars show clustering of applications: PRV2_2 hosts the secured portions of applications, while PRV2_3 hosts the unsecured portions. Traffic Characteristics Traffic by Applications
Analyzing Packet Traces: transmission patterns of the applications; ON-OFF traffic at the edges; arrivals binned in 15 and 100 millisecond bins. Traffic Characteristics Analyzing Packet Traces
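A simple sketch of the binning step described here, assuming the input is a list of packet arrival times in milliseconds: bucket arrivals into fixed-size bins and read off alternating ON/OFF periods. The sample trace is synthetic.

```python
def on_off_periods(timestamps_ms, bin_ms=15):
    """Collapse packet arrival times into alternating (state, duration_ms) runs."""
    if not timestamps_ms:
        return []
    last_bin = int(max(timestamps_ms) // bin_ms)
    occupied = [False] * (last_bin + 1)
    for t in timestamps_ms:
        occupied[int(t // bin_ms)] = True       # a bin with any packet is "ON"
    periods, state, run = [], occupied[0], 1
    for busy in occupied[1:]:
        if busy == state:
            run += 1
        else:
            periods.append(("ON" if state else "OFF", run * bin_ms))
            state, run = busy, 1
    periods.append(("ON" if state else "OFF", run * bin_ms))
    return periods

trace = [1, 3, 9, 14, 22, 180, 184, 199]        # synthetic arrival times in ms
print(on_off_periods(trace, bin_ms=15))          # [('ON', 30), ('OFF', 150), ('ON', 30)]
```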
Data-Center Traffic is Bursty: understanding the arrival process and the range of acceptable models. What is the arrival process? It is heavy-tailed for all three distributions (ON periods, OFF periods, inter-arrivals), with lognormal OFF periods across all data centers. This differs from the Pareto behavior of WAN traffic, so new models are needed.

Data Center  OFF Period Dist  ON Period Dist  Inter-arrival Dist
PRV2_1       Lognormal        Lognormal       Lognormal
PRV2_2       Lognormal        Lognormal       Lognormal
PRV2_3       Lognormal        Lognormal       Lognormal
PRV2_4       Lognormal        Lognormal       Lognormal
EDU1         Lognormal        Weibull         Weibull
EDU2         Lognormal        Weibull         Weibull
EDU3         Lognormal        Weibull         Weibull

Traffic Characteristics DC Traffic is Bursty
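A sketch of the kind of model selection behind this table: fit lognormal and Weibull candidates to measured OFF periods and compare goodness of fit with a Kolmogorov-Smirnov statistic. This uses NumPy/SciPy and synthetic stand-in data; it is not the authors' exact methodology.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
off_periods_ms = rng.lognormal(mean=3.0, sigma=1.0, size=2000)  # stand-in data

candidates = {
    "lognormal": stats.lognorm,
    "weibull": stats.weibull_min,
}
for name, dist in candidates.items():
    params = dist.fit(off_periods_ms, floc=0)                   # fix location at 0
    ks_stat, _ = stats.kstest(off_periods_ms, dist.cdf, args=params)
    print(f"{name:9s} KS statistic = {ks_stat:.3f}")            # smaller is a better fit
```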
Packet Size Distribution Bimodal (200B and 1400B) Small packets TCP acknowledgements Keep alive packets Persistent connections important to apps Traffic Characteristics Packet Size Distribution
Intra-Rack Versus Extra-Rack Results: [Bar chart of the intra-rack vs. extra-rack traffic split for EDU1-EDU3, PRV1, PRV2, and CLD1-CLD5.] Clouds: most traffic (75%) stays within a rack, due to colocation of applications and their dependent components. Other data centers: more than 50% of traffic leaves the rack, suggesting unoptimized placement. Traffic Characteristics Traffic Destination
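As a small illustration, the intra-rack share can be computed from per-flow byte counts and a server-to-rack mapping; the flows and rack map below are invented.

```python
def intra_rack_fraction(flows, rack_of):
    """flows: iterable of (src_server, dst_server, bytes); rack_of: server -> rack."""
    intra = total = 0
    for src, dst, nbytes in flows:
        total += nbytes
        if rack_of[src] == rack_of[dst]:
            intra += nbytes
    return intra / total if total else 0.0

rack_of = {"s1": "r1", "s2": "r1", "s3": "r2"}
flows = [("s1", "s2", 900), ("s1", "s3", 100)]
print(f"{intra_rack_fraction(flows, rack_of):.0%} of bytes stay within a rack")
```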
Observations from the Interconnect: Link utilization is low at the edge and aggregation layers; the core is the most utilized. Hot spots exist (> 70% utilization), but fewer than 25% of links are hot spots. Loss occurs on less-utilized links (< 70% utilization), implicating momentary bursts. Time-of-day variations exist, and the variation is an order of magnitude larger at the core. Traffic Characteristics Observations from Interconnect
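One way to reproduce the hot-spot tally, assuming per-link utilization samples are available (the link names and values below are made up): flag links above the 70% threshold used on the slide and report what fraction of links they represent.

```python
def hotspot_share(link_utilization, threshold=0.70):
    """Return the hot links and the fraction of all links that are hot."""
    hot = [link for link, util in link_utilization.items() if util > threshold]
    return hot, len(hot) / len(link_utilization)

utils = {"core1": 0.82, "core2": 0.71, "agg1": 0.35, "agg2": 0.28,
         "edge1": 0.10, "edge2": 0.05, "edge3": 0.12, "edge4": 0.09}
hot_links, share = hotspot_share(utils)
print(hot_links, f"{share:.0%} of links are hot")
```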
Argument for Larger Bisection: the perceived need for larger bisection bandwidth motivates VL2 [SIGCOMM 09], Monsoon [PRESTO 08], Fat-Tree [SIGCOMM 08], PortLand [SIGCOMM 09], and Hedera [NSDI 10]: congestion occurs at oversubscribed core links, so increase core links and eliminate congestion. Traffic Characteristics Argument for Larger Bisection
Calculating Bisection Demand: [Figure: core, aggregation, and edge layers above the application servers, with the bisection links as the bottleneck and the application links below.] If Σ traffic(app links) / Σ capacity(bisection links) > 1, then more devices are needed at the bisection. Traffic Characteristics Calculating Bisection Demand
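Expressed as a short calculation, the slide's check becomes the sketch below; the demand and capacity numbers are placeholders, not measured values from the study.

```python
def bisection_ratio(app_demands_gbps, bisection_link_capacities_gbps):
    """Ratio of cross-bisection application demand to total bisection capacity."""
    return sum(app_demands_gbps) / sum(bisection_link_capacities_gbps)

demand = [12.0, 7.5, 4.0]      # assumed cross-bisection demand per application (Gbps)
capacity = [10.0] * 8          # assumed eight 10 Gbps bisection links
ratio = bisection_ratio(demand, capacity)
print(f"bisection utilization = {ratio:.0%}",
      "-> add capacity" if ratio > 1 else "-> existing bisection suffices")
```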
Bisection Demand: Given our data (current applications and DC designs), NO, more bisection is not required: the aggregate bisection is only 30% utilized. The need is to better utilize the existing network, by load balancing across paths and migrating VMs across racks. Traffic Characteristics Bisection Demand
Insights Gained: 75% of traffic stays within a rack (clouds); applications are not uniformly placed. Half of all packets are small (< 200B); keep-alives are integral to application design. At most 25% of core links are highly utilized; an effective routing algorithm can reduce utilization by load balancing across paths and migrating VMs. Questioned popular assumptions: Do we need more bisection? No. Is centralization feasible? Yes. Traffic Characteristics Insights Gained