SPARTA: Scalable Per-Address RouTing Architecture

John Carter
Data Center Networking, IBM Research - Austin

IBM Research Science & Technology
IBM Research activities related to SDN / OpenFlow
IBM Research started a strategic initiative in data center networking in 2010.
Global participation from multiple labs, partnered with product teams.
SDN is one of the focus areas of the strategic initiative.
Heavily involved in ONF standards work (esp. FAWG → Table Typing Pattern).
[Diagram: SDN stack. SDN applications (examples): cloud network services, flow replication / recovery, security integration; network control apps, IT-network integration. SDN advanced controller capabilities. Network APIs. Network operating system: application APIs, network abstractions; orchestration, workflows, network services; network device control and management (plugins / drivers). OpenFlow. Mgmt tools. Network fabric / virtualization: SDN, DOVE (distributed overlay virtual Ethernet). Scalable, flexible, converged data center fabric.]
2011 IBM Corporation

Current SDN uses a tiny fraction of switch capabilities
Previously proposed SDN routing architectures are largely based on OpenFlow 1.0.
OpenFlow 1.0 only maps well onto (small) TCAM switch tables.
Thus, they often artificially constrain topology and/or addressing.
[Figure: total switch functionality, with the tiny fraction exposed by OpenFlow 1.0 (and thus most of SDN) highlighted]

SPARTA: Scalable Per-Address RouTing Architecture
SPARTA: a simple, HW-efficient, flexible routing mechanism.
  Build one spanning tree per destination host ([VLAN ID, DMAC]).
  Install one rule per tree per switch in the (huge) L2 exact match table.
Characteristics of SPARTA:
  Supports arbitrary (connected) physical topology.
  Exploits all available paths (statistically).
  Leaves TCAMs for their designed purposes (security, policy-based routing, ...).
  Flexible framework for traffic engineering, traffic steering, failure recovery, quality of service management, ...
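The "one rule per tree per switch" idea can be sketched as follows. This is only an illustration, not the authors' implementation: the three-switch topology, port numbers, MAC address, and the `forward` helper are all hypothetical. Each switch holds a single exact-match entry keyed by (VLAN ID, DMAC) for the destination, and a packet is forwarded hop by hop with no wildcard (TCAM) lookups.

```python
# Hypothetical three-switch network; names and port numbers are illustrative only.
# LINKS[switch][port] = neighbor reached through that port
# ("host" marks the port where the destination host is attached).
LINKS = {
    "s1": {1: "s2", 2: "s3"},
    "s2": {1: "s1", 2: "s3", 3: "host"},
    "s3": {1: "s1", 2: "s2"},
}

DEST = (10, "00:00:00:00:00:02")  # (VLAN ID, DMAC) of a host attached to s2

# One exact-match rule per switch for this destination's spanning tree.
L2_TABLES = {
    "s1": {DEST: 1},
    "s2": {DEST: 3},
    "s3": {DEST: 2},
}


def forward(ingress_switch, vlan, dmac, max_hops=16):
    """Follow exact-match L2 rules hop by hop; return the switches visited."""
    path = [ingress_switch]
    switch = ingress_switch
    for _ in range(max_hops):
        port = L2_TABLES[switch].get((vlan, dmac))
        if port is None:
            raise LookupError(f"no L2 rule for {(vlan, dmac)} on {switch}")
        neighbor = LINKS[switch][port]
        if neighbor == "host":
            return path  # delivered to the attached host
        path.append(neighbor)
        switch = neighbor
    raise RuntimeError("hop limit exceeded (forwarding loop?)")
```

Calling `forward("s3", 10, "00:00:00:00:00:02")` walks the rules from s3 to s2, where the host is attached. Adding another destination would add exactly one more entry to each switch's table.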

Data Center Network Design Goals
Scalable: 10s to 100s of thousands of hosts.
Efficient use of bandwidth. Mesh topologies from HPC?
Efficient host mobility.
Low latency.
Respect layering.
Multi-tenancy.
Very dynamic → self-configuration.
Compatible with existing / planned hardware.
Converged data and storage networks (CEE).

A Brief Tour Through a Modern 10GbE Switch

Modern Switch Hardware Overview
Lots of 10GbE & 40GbE PHYs.
Powerful merchant silicon switch chip (line rate packet forwarding).
Embedded control plane CPU (large legacy codebase).

Simplified Switch Pipeline (BRCM Trident)
[Figure: Packet In → Fwd. TCAM (configurable parse/lookup) → L2 Table (Ethernet parse/lookup) → Rewrite TCAM (configurable rewrite) → Packet Out]
Fwd. TCAM: wildcard match, small (~1K), forwarding rules.
L2 Table: exact match, huge (~100K), forwarding rules.
ECMP Group Table: ECMP hash indexed, small (~4K).
Rewrite TCAM: wildcard match, small (~1K), packet rewriting.
TCAMs: designed for limited use (security ACLs, PBR, ...).
L2/FDB table: huge, plentiful, simple to expand (RAM).
ECMP and multicast tables: additional flexibility.
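The difference between the two table types can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of the Trident hardware: a TCAM is modeled as a priority-ordered list of (value, mask, action) entries, the L2 table as a dictionary keyed by (VLAN, MAC), and all names, addresses, and actions are made up.

```python
def tcam_lookup(rules, key):
    """Wildcard match: return the action of the first rule whose masked bits match."""
    for value, mask, action in rules:
        if (key & mask) == (value & mask):
            return action
    return None


def l2_lookup(table, vlan, dmac):
    """Exact match: a single hash lookup on the full (VLAN, DMAC) key."""
    return table.get((vlan, dmac))


# Hypothetical TCAM: drop one 24-bit MAC prefix, permit everything else.
TCAM_RULES = [
    (0x001122000000, 0xFFFFFF000000, "drop"),
    (0x000000000000, 0x000000000000, "permit"),  # mask 0 matches any key
]

# Hypothetical L2 table: one entry per (VLAN, DMAC), mapping to an output port.
L2_TABLE = {(10, 0x001122AABBCC): 7}
```

The wildcard table needs per-entry masks and priorities, which is one way to see why it is small, while the exact-match table is an ordinary key-value store that can live in RAM.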

Basic SPARTA routing
Goal: route using the large L2 table on an arbitrary (mesh) topology.
Solution: build a spanning tree rooted at each destination.
All links used → approximate load balancing w/o ECMP.
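A minimal sketch of the per-destination tree construction, assuming the topology is given as an adjacency list of switches (the four-switch ring and the function names are hypothetical). A breadth-first search outward from the destination's switch gives every other switch a parent, and that parent is its next hop toward the destination.

```python
from collections import deque


def build_tree(adj, dest_switch):
    """BFS from the destination's switch; maps each switch to its next hop."""
    next_hop = {dest_switch: None}
    queue = deque([dest_switch])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in next_hop:
                next_hop[v] = u  # v forwards toward the destination via u
                queue.append(v)
    return next_hop


def links_used(next_hop):
    """Set of undirected links that appear in a tree."""
    return {frozenset((s, nh)) for s, nh in next_hop.items() if nh is not None}


# Hypothetical four-switch ring: a-b, b-c, c-d, d-a.
RING = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
```

On this ring, any single tree leaves one link idle, but the trees rooted at `"a"` and `"c"` together cover all four links, which is the intuition behind "all links used" when there are many destinations.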

Constructing SPARTA Routes
Basic option: use BFS to build min-length paths.
  Random
  Weight links by load
Some workloads/topologies benefit from non-min routes.
Non-minimal (NM) PAST:
  Do a BFS from a random switch as the root.
  Change directions on the route from root to destination.
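The two construction options can be sketched as below. This is one possible reading of the slide, not the authors' code: random tie-breaking is approximated by shuffling each switch's neighbor list during BFS, load-weighted selection is omitted, and the ring topology and all function names are hypothetical.

```python
import random
from collections import deque


def bfs_parents(adj, root, rng):
    """BFS from root, visiting neighbors in random order; maps child -> parent."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        neighbors = list(adj[u])
        rng.shuffle(neighbors)  # randomizes which equal-length path is chosen
        for v in neighbors:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def min_tree(adj, dest, rng):
    """Minimum-length tree: BFS rooted at the destination; parent == next hop."""
    return bfs_parents(adj, dest, rng)


def non_min_tree(adj, dest, rng):
    """Non-minimal tree: BFS from a random root, then reverse the root-to-dest path."""
    root = rng.choice(sorted(adj))
    next_hop = bfs_parents(adj, root, rng)  # every switch points toward root
    path = [dest]
    while path[-1] != root:
        path.append(next_hop[path[-1]])
    for child, par in zip(path, path[1:]):
        next_hop[par] = child  # flip the pointer so it leads toward dest
    next_hop[dest] = None
    return next_hop


def hops_to_dest(next_hop, src):
    """Follow next-hop pointers from src; return (final switch, hop count)."""
    hops = 0
    while next_hop[src] is not None:
        src = next_hop[src]
        hops += 1
        if hops > len(next_hop):
            raise RuntimeError("forwarding loop")
    return src, hops


# Hypothetical four-switch ring: a-b, b-c, c-d, d-a.
RING = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
```

Both variants yield a spanning tree in which every switch reaches the destination. Only `min_tree` guarantees shortest paths; `non_min_tree` may route some switches the long way around, which spreads load differently.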

SPARTA Discussion
One L2 entry per switch per tree → scales to > 100K hosts.
Consumes no TCAM entries for basic routing.
Obeys layering (does not re-use VLAN tag or other bits).
Broadcast/multicast: no change → provide via STP or SDN.
Security: use VLANs as normal (or ACLs).
Virtualization: use any higher layer virtualization overlay (e.g., NetLord, SecondNet, MOOSE, VXLAN).

SPARTA Implementation

SPARTA Implementation Details
Address detection and resolution: uses the controller for ARP, DHCP, and IPv6 ND and RS, for scalability.
Route computation:
  8,000 hosts → 40 µs to 1 ms per tree (300 ms per network)
  100,000 hosts → 500 µs to 5 ms per tree (40 s per network)
Route installation:
  700-1600 new rules per second per switch
  2-12 ms rule install latency → eagerly install routes
Failure recovery:
  Should patch affected portions of trees first.
  Randomly rebuild trees for link joins.
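A back-of-the-envelope check of these figures, as a rough sketch only. It assumes that whole-network computation time is simply the number of trees (one per host) times the per-tree time, and that a switch needs one rule per destination host; both function names are hypothetical.

```python
def network_compute_seconds(num_hosts, per_tree_seconds):
    """Time to compute one tree per host, assuming trees are built one after another."""
    return num_hosts * per_tree_seconds


def install_seconds(num_rules, rules_per_second):
    """Time for one switch to install num_rules at a steady install rate."""
    return num_rules / rules_per_second
```

With 8,000 hosts and the 40 µs per-tree figure, `network_compute_seconds(8000, 40e-6)` gives about 0.32 s, in line with the roughly 300 ms per network quoted. At 700 to 1600 new rules per second, installing 8,000 rules on one switch would take between `install_seconds(8000, 1600)` = 5 s and `install_seconds(8000, 700)` ≈ 11.4 s, which suggests why routes are installed eagerly and not on demand.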

SPARTA Performance
Simulated to allow evaluation at scale.
Assume max-min TCP fairness to make simulation feasible.
Compared against: STP, Valiant routing, ECMP (multipath routing).
Workloads:
  Urand: uniform random (benign)
  Stride-S: host i sends to host ((i+s)%n); adversarial (intra-rack)
  Shuffle-K: 128MB to all hosts, random order, K active connections
  MSR: synthetically generated from MSR data (light load)
Topologies: equal bisection bandwidth (oversubscription ratios) of EGFT (fat tree), Hyper-X (flattened butterfly), Jellyfish (random).
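Two of the workloads are simple enough to sketch as traffic-pair generators. The Stride-S formula comes from the slide; the Urand generator is an assumption (each host picks one uniformly random destination other than itself), and the function names are hypothetical.

```python
import random


def stride(n, s):
    """Stride-S: host i sends to host (i + s) % n."""
    return [(i, (i + s) % n) for i in range(n)]


def urand(n, rng):
    """Urand (assumed form): each host sends to a uniformly random other host."""
    pairs = []
    for i in range(n):
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1  # shift past i so a host never sends to itself
        pairs.append((i, j))
    return pairs
```

For example, `stride(4, 1)` yields the pairs (0, 1), (1, 2), (2, 3), (3, 0). Passing a seeded `random.Random` to `urand` makes a simulated run repeatable.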

Urand workload on Jellyfish
PAST performs as well as ECMP multipath routing. Spanning Tree performs terribly.
[Figure: aggregate throughput (0.0 to 1.0) vs. number of hosts (1K to 8K), panels B=1:2 and B=1:1; series: PAST, ECMP, NM-PAST, VAL, EthAir, STP]

Stride workload on Jellyfish
Non-Minimal PAST performs better than Valiant load balancing. Spanning Tree performs terribly.
[Figure: aggregate throughput (0.0 to 1.0) vs. number of hosts (1K to 8K), panels B=1:2 and B=1:1; series: PAST, ECMP, NM-PAST, VAL, EthAir, STP]

Summary for SPARTA
Meets all of our requirements for a DCN by exploiting only the most basic Ethernet forwarding hardware.
Scalable, low-latency, high-bandwidth network from COTS ToR switches. (So we can exploit HPC-style mesh topologies!)
Can provide 1-2X performance of ECMP.
Implemented on existing hardware w/ OF 1.0 (!!!)
Leaves TCAM entries for designed uses: PBR, security, ...
Flexible framework for traffic engineering, traffic steering, QoS management, resiliency, ...
For full results, see CoNEXT 2012 paper (next week).

Suggestions for SDN Research
Understand and exploit what is in the actual hardware.
  Do not let the OpenFlow specification restrict your vision ...
  ... but don't assume magical hardware ("unicorns and rainbows").
Consider what can be done by running SDN-aware functions on the control processor (à la HP Labs DevoFlow).
  Controller understands the big picture → guides switch-local decisions.
  Switch firmware can respond in µsecs, not msecs.
  Opportunity: Indigo or similar open source OpenFlow switch firmware.
  Pushing it to the limit → switchlets (Active Networking reborn?)
Why just networks? Software-defined everything:
  SDS: software-defined storage (lots of startups claiming this)
  SDC: software-defined computation (VMs kind of do this)
  SDDC: software-defined data center

TCP Bolt: Faster small flows with lossless Ethernet
Lossless → no congestion collapse.
Send at line-rate immediately.
1.5-3X better than vanilla TCP for 64K to 8M flows; many real DC flows are this size.