NaaS Network-as-a-Service in the Cloud

Joint work with Matteo Migliavacca, Peter Pietzuch, and Alexander L. Wolf
costa@imperial.ac.uk

Motivation: mismatch between application abstractions and the network
- How programmers see the network [figure]
- What the network really looks like [figure]
- What programmers really see of the network:
    int send(int sockfd, const void *msg, int len, int flags);
    int recv(int sockfd, void *buf, int len, int flags);
- No control over network resources
- Hard to map distributed applications onto the physical topology

Example #1: MapReduce
- Many cloud data centers have a high degree of oversubscription (e.g., 1:20 [IMC '10])
  - Intra-rack bandwidth >> inter-rack bandwidth
  - Location of map and reduce tasks is critical
- Current approach: reverse-engineer the network
  - Combination of low-level tools (ping, traceroute, ...) and complex clustering algorithms
- Issues
  - Low-level process
  - Time consuming
  - Potentially inaccurate
- 70% of cross-rack traffic is reduce traffic
- 50% of reduce phases take 62% longer than with ideal placement
Source: Ananthanarayanan et al., OSDI '10
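
To make the placement argument concrete, the sketch below estimates how long a reducer needs to pull its input when the data is rack-local versus when it crosses an oversubscribed core. Only the 1:20 ratio comes from the slide. The 1 Gbps server link, the 10 GB input size, and the function names are illustrative assumptions, and the model takes the worst case in which every server in the rack sends across racks at the same time.

```python
def effective_bandwidth_gbps(link_gbps: float, oversubscription: float,
                             cross_rack: bool) -> float:
    """Worst-case per-server bandwidth: the full link inside the rack,
    the link divided by the oversubscription ratio across racks."""
    return link_gbps / oversubscription if cross_rack else link_gbps


def transfer_time_s(data_gb: float, bandwidth_gbps: float) -> float:
    """Seconds needed to move data_gb gigabytes at bandwidth_gbps gigabits/s."""
    return data_gb * 8 / bandwidth_gbps


LINK_GBPS = 1.0      # assumed server link speed
OVERSUB = 20.0       # 1:20 oversubscription, as quoted on the slide
SHUFFLE_GB = 10.0    # assumed reducer input size

local_s = transfer_time_s(
    SHUFFLE_GB, effective_bandwidth_gbps(LINK_GBPS, OVERSUB, cross_rack=False))
remote_s = transfer_time_s(
    SHUFFLE_GB, effective_bandwidth_gbps(LINK_GBPS, OVERSUB, cross_rack=True))

print(f"rack-local: {local_s:.0f} s, cross-rack (worst case): {remote_s:.0f} s")
```

Under these assumptions the cross-rack transfer is slower by the oversubscription factor, which is why a scheduler that cannot see rack boundaries can lose so much time in the reduce phase.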

Example #1: MapReduce
Wish list #1: Network Visibility
- Tenants are provided with an (abstract) view of their allocated VMs
- No need for reverse-engineering, easier deployment

Example #2: Iterative Jobs
- Iterative jobs often adopt a one-to-many communication pattern, e.g., Netflix collaborative filtering
- Current approach
  - Point-to-point
  - Application-level multicast tree
  - BitTorrent-like solutions
- Issues
  - The server's 1 Gbps link is the bottleneck [figure: below the ToR switch, each child only gets 500 Mbps]
  - Even perfect network visibility would not help
Source: Chowdhury et al., SIGCOMM '11
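
The 500 Mbps figure follows from simple division: a node that forwards the same data to k children over a single link can give each child at most 1/k of that link. The following is a minimal sketch, assuming a 1 Gbps server link and a balanced application-level tree. The function names are illustrative and do not come from the talk.

```python
def per_child_mbps(link_mbps: float, fanout: int) -> float:
    """Bandwidth each child receives when a parent replicates the same
    stream to `fanout` children over one outgoing link."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    return link_mbps / fanout


def tree_depth(receivers: int, fanout: int) -> int:
    """Levels below the root needed for a balanced tree to reach all receivers."""
    depth, reached, width = 0, 0, 1
    while reached < receivers:
        width *= fanout      # number of nodes on the next level
        reached += width
        depth += 1
    return depth


LINK_MBPS = 1000.0   # assumed 1 Gbps server link
RECEIVERS = 100      # assumed number of receiving servers

for fanout in (1, 2, 4):
    print(f"fanout={fanout}: {per_child_mbps(LINK_MBPS, fanout):.0f} Mbps per child, "
          f"depth {tree_depth(RECEIVERS, fanout)}")
```

A smaller fan-out raises the per-child rate but deepens the tree, so the sender's link remains the limiting factor either way. This is the gap that replication inside the network would close.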

Example #2: Iterative Jobs
Wish list #2: Custom Forwarding
- Tenants can implement custom routing protocols
- E.g., anycast, multicast, content-based routing, key-based routing, multi-path routing, ...

Example #3: Interactive Queries
- Many large-scale web services use a partition-aggregate pattern
  - Queries are sent to multiple workers and responses are combined
- Current approach
  - Responses from workers are aggregated at intermediate layers
  - E.g., Google Search
- Issues
  - Requires high network bandwidth
  - Aggregator servers become the bottlenecks [figure: at the parent server, each flow from the workers only gets 500 Mbps]
  - Custom forwarding would not help
Source: Jeff Dean, WSDM '09

Example #3: Interactive Queries
Wish list #3: In-network Processing
- Tenants can perform arbitrary packet processing on path
- E.g., in-network aggregation [Camdoop@NSDI '12], opportunistic caching, semantic de-duplication
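
As an illustration of why on-path aggregation relieves the parent server, the sketch below compares how many records reach the parent when workers send partial results directly and when an intermediate hop merges them first. It is a toy model, not Camdoop's implementation. It assumes the combine step is associative and commutative (here, summing per-key counts), and the grouping of workers into "racks" and all names are hypothetical.

```python
from collections import Counter


def merge(partials):
    """Combine partial per-key counts into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)   # Counter.update adds counts key by key
    return total


def records_direct(workers):
    """Records arriving at the parent when every worker sends to it directly."""
    return sum(len(w) for w in workers)


def records_aggregated(workers, rack_size):
    """Records arriving at the parent when each group of `rack_size` workers
    is merged at an intermediate hop first. Returns (records, final result)."""
    racks = [workers[i:i + rack_size] for i in range(0, len(workers), rack_size)]
    rack_results = [merge(rack) for rack in racks]
    return sum(len(r) for r in rack_results), merge(rack_results)


workers = [
    Counter({"a": 1, "b": 2}),
    Counter({"a": 3, "c": 1}),
    Counter({"b": 1, "c": 4}),
    Counter({"a": 2, "b": 2}),
]

direct = records_direct(workers)
aggregated, result = records_aggregated(workers, rack_size=2)
print(f"records at parent: direct={direct}, aggregated={aggregated}")
print(f"final result: {dict(result)}")
```

The final answer is identical in both cases, and only the volume of data crossing the parent's link changes. The saving grows with the overlap between the workers' key sets.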

Introducing NaaS
- Goal: mechanisms and abstractions to enable cloud tenants to efficiently, easily, and safely process packets within the network
  - This entails visibility over network resources, custom forwarding, and processing of packets
- Providers would benefit too
  - Today they also need to reverse-engineer applications
  - NaaS would allow more fine-grained traffic engineering
- This is complementary to SDN / OpenFlow / ...
  - Focus on application-specific rather than application-agnostic services
  - But some techniques can be re-used

Why Now?
- DCs are not mini-internets
  - Single owner / administration domain
  - We know (and define) the topology
  - Low hardware and software (network protocols) diversity
  - Trusted components (e.g., hypervisors)
- Several proposals for software-based routers
  - RouteBricks, ServerSwitch, PacketShader, SideCar, NetMap, ...
  - Typically, these are used to replace traditional (application-agnostic) network services (e.g., IPv4 forwarding, DPI)
- Why not also use them to implement application-specific services?
  - E.g., aggregating packets in a distributed query, or content-based routing

NaaS Architecture
- Switches are augmented with processing capabilities (NaaS box)
  - Software routers à la RouteBricks or hybrid solutions like ServerSwitch
- (Oversubscribed) fat-tree-like topology
  - Lower in-bound switch throughput
  - E.g., for a 27K-server topology, max throughput is 48 Gbps
- Tenants deploy their processing elements (INPEs) on each NaaS box
- Fast path for non-NaaS traffic

(Preliminary) Evaluation
- Questions
  - What are the benefits for NaaS users?
  - What is the impact on non-NaaS users?
  - What processing rate is required?
- Setup
  - Flow-level simulator
  - 8,192-server fat-tree topology (32-Gbps switches)
  - 80% traditional TCP flows, 20% a combination of multicast, aggregation, and caching
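
The server counts and switch throughputs quoted in the architecture example (27K servers, 48 Gbps) and in this setup (8,192 servers, 32 Gbps) are consistent with a standard k-ary fat-tree built from k-port switches with 1 Gbps ports. That reading is an assumption: the slides only say "fat-tree-like" and state neither the port count nor the port speed. A short sketch of the arithmetic, with illustrative function names:

```python
def fat_tree_servers(k: int) -> int:
    """Number of servers in a standard k-ary fat-tree (k pods, k-port switches)."""
    if k < 2 or k % 2:
        raise ValueError("k must be an even number >= 2")
    return k ** 3 // 4


def switch_capacity_gbps(k: int, port_gbps: float = 1.0) -> float:
    """Aggregate throughput of one k-port switch; 1 Gbps ports are assumed."""
    return k * port_gbps


for k in (32, 48):
    print(f"k={k}: {fat_tree_servers(k):,} servers, "
          f"{switch_capacity_gbps(k):.0f} Gbps per switch")
```

With k = 32 this gives the 8,192 servers and 32 Gbps switches of the simulated topology, and k = 48 gives 27,648 servers and 48 Gbps, matching the "27K-server" example.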

[Figure: Total flow completion time, normalized to baseline (0% to 100%, lower is better), versus NaaS box maximum processing rate R in Gbps (1, 2, 4, 8, 16, 32). Annotations: "R = 1 Gbps already reduces time by 65%" and "96% time reduction".]
Even a low processing rate is enough to achieve significant benefits.

[Figure: Individual flow completion time. CDF of flows (0 to 1) versus flow completion time (0 to 1000 s) for the baseline and for R = 1, 2, 4, 8, 16, and 32 Gbps. This includes non-NaaS flows too.]
The use of NaaS is beneficial for all flows (including non-NaaS ones).

Challenges
- Scalability and performance isolation
  - Traditional software routers assume a handful of trusted services
  - In NaaS we expect 10s or 100s of (potentially malicious or poorly written) INPEs per switch
- Programming abstractions
  - We should not expose the actual network programming
    - Too complex for many users
    - Sensitive information
  - Trade-off between flexibility and performance
- Pricing schemes
  - How should we charge tenants?

Summary
- Currently tenants have little control over the network
- NaaS focuses on enabling tenants to deploy applications within the network
  - Efficiency
  - Simplified development
  - Providers benefit too
- Ongoing work
  - SideCar [HotNets '11]-inspired design
  - Transparent acceleration of mainstream applications