Leveraging Heterogeneity to Reduce the Cost of Data Center Upgrades


Leveraging Heterogeneity to Reduce the Cost of Data Center Upgrades Andy Curtis joint work with: S. Keshav Alejandro López-Ortiz Tommy Carpenter Mustafa Elsheikh University of Waterloo

Motivation Data centers are a critical part of IT infrastructure - Expensive ($1000/year/server) - And they change over time

Data centers constantly evolve - 63% of Data Center Knowledge readers are either in the midst of data center expansion projects or have just completed a new facility - 59% continue to build and manage their data centers in-house http://www.datacenterknowledge.com/archives/2010/08/16/data-center-industry-expansion-in-full-swing/

Network upgrade motivation Several prior solutions for greenfield data centers - VL2, flattened butterfly, HyperX, BCube, DCell, Al-Fares et al., MDCube What about legacy data centers?

Existing topologies are not flexible enough

Goal It should be easy and cost-effective to add capacity to a data center network

Challenging problem Designing a data center expansion or upgrade isn't easy - Huge design space - Many constraints

Problem 1 It's hard to analyze and understand heterogeneous topologies Problem 2 How to design an upgraded topology?

Problem 1 High performance network topologies are based on rigid constructions - Homogeneous switches - Prescribed switch radix - Single link rate Solutions: 1. develop theory of heterogeneous Clos networks 2. explore unstructured data center network topologies

Two solutions: LEGUP: output is a heterogeneous Clos network [Curtis, Keshav, López-Ortiz; CoNEXT 2010] REWIRE: designs unstructured DCN topologies [Curtis et al.; INFOCOM 2012]

LEGUP in brief: LEGUP designs upgraded/expanded networks for legacy data center networks

Input: budget, existing network topology, list of switches & line cards, optional data center model

Output: an upgraded topology - a difficult optimization problem

Difficult optimization problem First pass: limit solution space by finding only heterogeneous Clos networks

Clos networks This is a physical realization of a Clos network (Internet - Core - Aggregation - ToR)

Clos networks We can find a logical topology for this network

Heterogeneous Clos networks The logical topology is a forest

Theoretical contributions Lemma 1: How to construct all optimal* logical forests for a set of switches Lemma 2: How to build a physical realization from a logical forest Theorem: A characterization of heterogeneous Clos networks This is the first optimal heterogeneous topology *optimal = uses the same link capacity as an equivalent Clos network

Problem 1 It's hard to analyze and understand heterogeneous topologies - addressed by heterogeneous Clos networks (more later...) Problem 2 How to design an upgraded topology?

Problem 2 The upgraded network should: - Maximize performance, minimize cost - Be realizable in the target data center - Incorporate existing network equipment when it makes sense Approach: use optimization

LEGUP algorithm Branch and bound search of the solution space - Heuristics to map switches to racks (see paper for details) Running time is the bottleneck in the algorithm - Exponential in the number of switch types and (worst-case) in the number of ToRs - 760-server data center: 5-10 minutes to run the algorithm - 7600-server data center: 1-2 days - But the search can be parallelized
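LEGUP's actual branch-and-bound is in the paper; as an illustration of the pattern it names, here is a minimal sketch on a simplified stand-in problem: choosing a subset of candidate switches under a budget to maximize total port count. The objective and the fractional bound are hypothetical stand-ins, not LEGUP's real performance model.

```python
def branch_and_bound(switches, budget):
    """switches: list of (cost, ports). Returns the best total port count
    achievable within budget. Stand-in objective, not LEGUP's real one."""
    # Sort by ports-per-dollar so the fractional bound below is optimistic.
    switches = sorted(switches, key=lambda s: s[1] / s[0], reverse=True)
    best = 0

    def bound(i, spent, ports):
        # Optimistic bound: fill the remaining budget fractionally.
        remaining = budget - spent
        for cost, p in switches[i:]:
            if cost <= remaining:
                remaining -= cost
                ports += p
            else:
                ports += p * remaining / cost
                break
        return ports

    def search(i, spent, ports):
        nonlocal best
        best = max(best, ports)
        if i == len(switches) or bound(i, spent, ports) <= best:
            return  # prune: even the optimistic bound can't beat best
        cost, p = switches[i]
        if spent + cost <= budget:
            search(i + 1, spent + cost, ports + p)  # branch: take switch i
        search(i + 1, spent, ports)                 # branch: skip switch i

    search(0, 0, 0)
    return best
```

Pruning on the bound is what lets such a search skip large parts of the exponential space in practice, even though the worst case remains exponential, as the slide notes.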

LEGUP summary Developed a theory of heterogeneous Clos networks Implemented the LEGUP design algorithm On our data center, we see substantial cost savings: spend less than half as much money as a fat-tree for the same performance

Two solutions: LEGUP: output is a heterogeneous Clos network [Curtis, Keshav, López-Ortiz; CoNEXT 2010] REWIRE: designs unstructured DCN topologies [Curtis et al.; INFOCOM 2012]

Can we do better with unstructured networks?

Problem Now we have an even harder network design problem Approach Use local search heuristics to find a good enough solution

REWIRE Uses simulated annealing to find a network that: - Maximizes performance (bisection bandwidth, diameter) Subject to: - The budget - Physical constraints of the data center model (thermal, power, space) - No topology restrictions Costs = new cables + moved cables + new switches
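The cost breakdown above (new cables + moved cables + new switches) can be sketched as a small cost function. The cable and install prices here are the example values from the cost-model slide later in the talk; the assumption that moving a cable pays only the install cost is mine, not the paper's.

```python
# Hypothetical REWIRE-style cost model. Prices taken from the talk's
# example cost-model table; the "moved cable = install only" rule is assumed.
CABLE_PRICE = {("1Gb", "short"): 5, ("1Gb", "medium"): 10, ("1Gb", "long"): 20,
               ("10Gb", "short"): 50, ("10Gb", "medium"): 100, ("10Gb", "long"): 200}
INSTALL = {"short": 10, "medium": 20, "long": 50}

def upgrade_cost(new_cables, moved_cables, new_switches):
    """new_cables: list of (rate, length); moved_cables: list of lengths;
    new_switches: list of purchase prices. Returns total dollars."""
    cost = sum(CABLE_PRICE[(r, l)] + INSTALL[l] for r, l in new_cables)
    cost += sum(INSTALL[l] for l in moved_cables)  # assumed: move = re-install
    cost += sum(new_switches)
    return cost
```

For example, one new short 10 Gb cable, one moved medium-length cable, and one $1,500 switch would cost 50 + 10 + 20 + 1,500 = $1,580 under these assumptions.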

Simulated annealing algorithm At each iteration, compute: - The performance of the candidate solution - Whether to accept this solution - The next neighbor to consider But there is no known algorithm to find the bisection bandwidth of an arbitrary network!
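The annealing loop itself is standard; here is a minimal generic sketch with a toy objective (REWIRE's real objective is network performance under the budget and physical constraints above, which this does not model).

```python
import math
import random

def simulated_annealing(initial, neighbor, score, iters=2000, t0=1.0, cooling=0.995):
    """Generic simulated annealing: maximize score(state)."""
    state = best = initial
    t = t0
    for _ in range(iters):
        cand = neighbor(state)
        delta = score(cand) - score(state)
        # Always accept improvements; accept worse moves with prob. exp(delta/t),
        # which shrinks as the temperature t cools.
        if delta >= 0 or random.random() < math.exp(delta / t):
            state = cand
        if score(state) > score(best):
            best = state
        t *= cooling  # geometric cooling schedule
    return best

# Toy usage: maximize f(x) = -(x - 3)^2 starting from x = 0.
best = simulated_annealing(0.0,
                           lambda x: x + random.uniform(-0.5, 0.5),
                           lambda x: -(x - 3) ** 2)
```

In REWIRE the neighbor step would be a cheap topology edit (add, remove, or move a link), which is why evaluating each candidate's performance, i.e. its bisection bandwidth, is the hard part the next slides address.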

Bisection bandwidth computation Easy for a single cut (S, S'): bw(S, S') = link cap(S, S') / min{ server rates(S), server rates(S') } Example: bw(S, S') = 4 / min{ 2, 6 } = 2 Then the bisection bandwidth is the min over all cuts
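The per-cut formula and the min-over-all-cuts step can be sketched directly by brute force. This enumerates every cut, so it only works on toy graphs; the talk's point is precisely that arbitrary topologies have exponentially many cuts, motivating the LP-based method described later.

```python
from itertools import combinations

def cut_bandwidth(nodes, edges, rates, S):
    """bw of cut (S, S'): cut capacity / min of total server rates per side.
    edges: list of (u, v, capacity); rates: node -> server rate."""
    cap = sum(c for (u, v, c) in edges if (u in S) != (v in S))
    other = set(nodes) - set(S)
    return cap / min(sum(rates[n] for n in S), sum(rates[n] for n in other))

def bisection_bandwidth(nodes, edges, rates):
    """Brute force: min over all non-trivial cuts (exponential time)."""
    best = float("inf")
    for k in range(1, len(nodes)):
        for S in combinations(nodes, k):
            best = min(best, cut_bandwidth(nodes, edges, rates, set(S)))
    return best
```

On a 4-node ring with unit link capacities and unit server rates, the worst cut separates two adjacent nodes (capacity 2 over demand 2), giving a bisection bandwidth of 1.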

Bisection bandwidth computation Easy on tree-like topologies because there are O(n) cuts

Bisection bandwidth computation Exponentially many cuts on arbitrary topologies Need: a min-cut, max-flow type theorem for multicommodity flow

Bisection bandwidth computation Theorem [Curtis and López-Ortiz, INFOCOM 2009]: A network can feasibly route all traffic matrices feasible under the server NIC rates using multipath routing iff all its cuts have bandwidth at least a sum dependent on αi for all nodes i We can compute the αi values using linear programming [Kodialam et al., INFOCOM 2006] These two theoretical results give us a polynomial-time algorithm to find the bisection bandwidth of an arbitrary network

Evaluation How much performance do we gain with heterogeneous network equipment?

Evaluation U of Waterloo School of Computer Science data center as input Three scenarios: - Upgrading the network (see paper) - Expansion by adding servers - Greenfield data center

Evaluation: input SCS data center topology - 19 edge switches, 760 servers - Heterogeneous edge switches - All aggregation switches are HP 5406 models

Evaluation: input The data center handles airflow poorly, so we add thermal constraints modeling this (cold/hot aisles, chiller airflow)

Evaluation: cost model

Switches:
  1 Gb ports  10 Gb ports  Watts  Cost ($)
  24          -            100    250
  48          -            150    1,500
  48          4            235    5,000
  -           24           300    6,000
  -           48           600    10,000
  -           144          5000   75,000

Cables:
  Rate          Short ($)  Medium ($)  Long ($)
  1 Gb          5          10          20
  10 Gb         50         100         200
  Install cost  10         20          50

Evaluation: comparison methods Generalized fat-tree - Bounded best-case performance Greedy algorithm - Finds the link addition that improves performance the most, adds it, and repeats Random graph - Proposed by Singla et al. (HotCloud 2011) as a data center network topology

Expanding the Waterloo SCS data center
[Chart: oversubscription (bisection bandwidth ratio, 0-0.05) vs. cumulative servers added (0, 160, 320, 480, 640; starting from 760 servers), comparing the original network with Fat-tree 1Gb, GREEDY, LEGUP, and REWIRE designs; resulting network diameters (2-4) annotated per design.]

Greenfield network design 1920 servers Edge switches have 48 gigabit ports - Assume 24 servers per rack

Greenfield network design
[Chart: oversubscription ratio (0-0.4) vs. budget ($125, $250, $500, $1000 per rack), comparing Fat-tree, Random, LEGUP, and REWIRE designs; resulting diameters (2-4) annotated per design.]

Expanding a greenfield network 1600 servers initially - Grow by increments of 400 servers (10 racks) - $6000/rack budget

Expanding a greenfield network
[Chart: oversubscription (bisection bandwidth ratio, 0-0.40) vs. total servers (1600, 2000, 2400, 2800, 3200), comparing Fat-tree 1Gb, Fat-tree 10Gb, LEGUP, and REWIRE designs; resulting diameters (2-4) annotated per design.]

Are unstructured topologies worth it? Higher performance - Up to 10x more bisection bandwidth than a heterogeneous Clos network for the same cost - Lower latency (can get 2 hops between racks instead of 4) But difficult to manage - Cost to build/manage is unclear - Need Multipath TCP [Raiciu et al., SIGCOMM 2011] or SPAIN [Mudigonda et al., NSDI 2010] to effectively use the available bandwidth

REWIRE future work Structural constraints on topology - Generalize the greenfield topology design framework of Mudigonda et al., USENIX ATC 2011 The bisection bandwidth computation algorithm is numerically unstable Scale the local search approach to larger networks Relationship between spectral gap and bisection bandwidth?

Conclusions Best practices are not enough for data center upgrades Need theory to understand and effectively build heterogeneous networks Implemented LEGUP and REWIRE, optimization algorithms to design heterogeneous DCNs