vnetwork Future Direction Howie Xu, VMware R&D November 4, 2008
Virtual Datacenter OS from VMware
Infrastructure vservices and Cloud vservices: existing and planned (roadmap)
Agenda
vnetwork -- State of the Art
  vnetwork Introduction
  vnetwork Current Product Capability
vnetwork -- Future Direction
  vnetwork Distributed Switch
  3rd Party Virtual Switches
  vnetwork Appliance and VMsafe-Net API
vnetwork I/O Virtualization and Performance
  Virtualized I/O (aka Emulation) Architecture
  VMDirectPath (aka Passthrough) Architecture
The Need for vnetwork
vnetwork: Comparing Physical to Virtual
Physical: conventional access, distribution, core design; designed with redundancy for enhanced availability
Virtual: the virtual network is similar to the physical network, but the access layer moves into the ESX host as a virtual switch
vswitches allow additional flexibility: instant provisioning and per-VM control
vnetwork Introduced Flexible Configuration
Add vswitches as required
Assign guest OSes and physical NICs (vmnics) as required
Guest OS traffic is switched internally
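The internal switching a vswitch performs is ordinary L2 forwarding: learn source MACs, forward known destinations out one port, flood unknowns. A minimal sketch (hypothetical names, illustrative only -- not VMware code) shows why inter-VM traffic never has to leave the host:

```python
class VSwitch:
    """Toy L2 learning switch (illustrative sketch, not VMware's implementation)."""

    def __init__(self, ports):
        self.ports = set(ports)   # attached vnic/vmnic ports
        self.mac_table = {}       # learned MAC address -> port

    def forward(self, in_port, src_mac, dst_mac):
        self.mac_table[src_mac] = in_port        # learn the source MAC
        if dst_mac in self.mac_table:
            return {self.mac_table[dst_mac]}     # known: forward to one port
        return self.ports - {in_port}            # unknown: flood everywhere else
```

Once both VM MACs are learned, frames between them go port-to-port inside the host and never touch the physical uplink.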
Current Capabilities (VI 3.5) -- Brief Recap
What is it? A virtual network (i.e., a set of virtual switches) living inside ESX that provides interconnectivity between VMs and the external physical network; it enables many VMs to share physical NICs and to communicate directly with each other
VI Networking Features (VI 3.5): L2 Ethernet switching (inter-VM traffic); VLAN segmentation; rate limiting to restrict traffic generated by a VM; built-in server NIC port aggregation (VMware NIC Teaming); built-in NIC redundancy for enhanced availability and load balancing of physical network resources
VI I/O Features (VI 3.5): Enhanced VMXNET, E1000, VLANCE; checksum offloading, TSO, Jumbo Frames, NetQueue; 10GigE, FCoE; IB (community support)
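The per-VM rate limiting listed above is traffic shaping, and the classic way to shape is a token bucket: tokens refill at the configured rate, and a frame is transmitted only if the bucket holds enough of them. A rough sketch (illustrative model, not the ESX implementation):

```python
class TokenBucket:
    """Token-bucket shaper sketch: permits `rate` bytes/sec with bursts up to `burst` bytes."""

    def __init__(self, rate, burst):
        self.rate = rate        # refill rate, bytes per second
        self.burst = burst      # bucket depth, bytes
        self.tokens = burst     # start with a full bucket
        self.last = 0.0         # timestamp of last refill

    def allow(self, size, now):
        # Refill proportionally to elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True         # transmit the frame
        return False            # over the limit: drop or queue
```

The burst parameter lets a VM send a short spike at line rate while the long-run average stays bounded by the configured rate.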
Standard Virtual Networking (VI 3.5)
Virtual switches live inside each ESX host
Managed as separate virtual networks, one per host
Virtual Networks with VI 3.5 (diagram): four hosts (Host1-Host4), each with its own virtual switch and its own per-host port group for the Virtual Machine Network (VLAN 20), with two VMs per host
Introducing vnetwork Distributed Switch
Unified network virtualization management: VirtualCenter provides an abstracted, resource-centric view of networking
Simplifies network management by moving away from host-level network configuration (to the cluster level)
Statistics and policies follow the VM, simplifying debugging and troubleshooting
Builds the foundation for networking resource pools (view the network as a clustered resource)
What is the vnetwork Distributed Switch?
Simplified network management: moves virtual network configuration and management from the ESX host level to the datacenter level
Functions like a single virtual switch, but across multiple ESX hosts; defined at the datacenter level (not the ESX host level)
L2 switching, VLAN tagging, distributed port groups, NIC teaming, plus new virtual networking features
Maintains network runtime state across VMotion, HA, etc. -- important for stateful networking features such as inline IDS/IPS, firewalls, and third-party virtual switches (e.g., firewall connection state, virtual networking port statistics)
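The key behavioral difference is that a VM's port, and whatever state hangs off it, is a datacenter-level object rather than a per-host one. A toy model (hypothetical names, not the product API) of runtime state following a VM across hosts:

```python
class DistributedSwitch:
    """Toy model: per-VM port state lives in the distributed switch, not in any one host."""

    def __init__(self):
        self.ports = {}   # vm name -> {"host": ..., "stats": ..., "conn_state": ...}

    def connect(self, vm, host):
        self.ports[vm] = {"host": host, "stats": 0, "conn_state": set()}

    def record_traffic(self, vm, nbytes, flow):
        self.ports[vm]["stats"] += nbytes          # port statistics
        self.ports[vm]["conn_state"].add(flow)     # firewall-style connection state

    def vmotion(self, vm, dest_host):
        # Only the placement changes; statistics and connection state
        # stay attached to the VM's distributed port.
        self.ports[vm]["host"] = dest_host
```

With per-host standard vswitches, the equivalent of `vmotion` would land the VM on a fresh port with empty state; here the state survives the move.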
Existing VI virtual networking (diagram): four hosts (Host1-Host4), each with its own virtual switch and per-host port group for the Virtual Machine Network (VLAN 20)...
...becomes distributed with the vnetwork Distributed Switch (diagram): a single distributed switch spanning Host1 through Host4, with a single distributed port group for the Distributed Virtual Machine Network (VLAN 20)
VI Virtual Network Solution Comparisons

Feature                               Standard Virtual Switch   vnetwork Distributed Switch
L2 Forwarding                         YES                       YES
VLAN Segmentation                     YES                       YES
802.1Q Tagging                        YES                       YES
NIC Teaming                           YES                       YES
TX Rate Limiting                      YES                       YES
Unified Management                    --                        YES
VM Network Port Block                 --                        YES
PVLAN Support                         --                        YES
Network Runtime State Follows VM      --                        YES
vnetwork Appliance + Switch APIs      --                        YES
Distributed Virtual Networking: 3rd Party Virtual Switches
Enterprise networking vendors can provide their own implementations of the virtual switch, leveraging the vnetwork switch API interfaces
Enables support for 3rd-party networking capabilities, including monitoring and management of the virtual network
Third-party switch products run on the vnetwork platform alongside the vnetwork Distributed Switch
VI Virtual Networking -- 3rd Party Virtual Switch Style (diagram): a 3rd-party distributed switch on the vnetwork platform spanning Host1 through Host4, with a single distributed port group for the 3rd Party Distributed Virtual Machine Network
First 3rd-party vswitch launched at VMworld 09: Cisco Nexus 1000V
3rd-party virtual switches enable end-to-end physical and virtual networking feature parity
Network admins are now able to provision and monitor the virtual network using existing physical network management tools

Third-Party vSwitch Roles and Responsibilities

Task                                        vnetwork Distributed Switch   vnetwork (with 3rd-party virtual switching)
Associate VMs to virtual networks           VI Admin                      VI Admin
Associate server NICs to virtual networks   VI Admin                      VI Admin
Create virtual switches                     VI Admin                      Network Admin
Create port groups                          VI Admin                      Network Admin
Modify VLAN settings (virtual)              VI Admin                      Network Admin
Configure NIC team                          VI Admin                      Network Admin
Monitor virtual network                     VI Admin                      Network Admin
Physical World (diagram): host agents paired with a physical appliance on the network
Virtual Infrastructure Today (diagram): guest agents inside VMs on ESX hosts, still paired with a physical appliance on the network
Intermediate Step: Inline vapp on VMware ESX
Limitations:
  Configuration: more virtual switches, more complex topology
  VMotion: no or limited support; disables VC safeguards
  Performance
Virtual Infrastructure Tomorrow (diagram: VMware ESX)
Packet Interposition/Filter
A filter is a runtime object, instantiated when the protected VM is powered on
Operations: process traffic; save state; restore state
State: configuration parameters; network traffic state
The save-state and restore-state methods are invoked during VMotion
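The save/restore contract can be pictured as a small class whose serialized state is handed from the source host to the destination host during VMotion. A sketch under assumed names (the real VMsafe-Net API differs; this only illustrates the lifecycle):

```python
class StatefulFilter:
    """Sketch of a VMotion-aware packet filter -- illustrative, not the VMsafe-Net API."""

    def __init__(self, config):
        self.config = config        # configuration parameters
        self.flows = set()          # runtime traffic state, e.g. firewall connections

    def process(self, src, dst):
        self.flows.add((src, dst))  # track the flow, then allow/deny per policy
        return dst in self.config["allowed"]

    def save_state(self):
        # Invoked on the source host during VMotion.
        return {"config": self.config, "flows": set(self.flows)}

    @classmethod
    def restore_state(cls, saved):
        # Invoked on the destination host: the new filter instance
        # resumes with the connection table intact.
        f = cls(saved["config"])
        f.flows = set(saved["flows"])
        return f
```

This is exactly why runtime state must follow the VM: a stateful firewall that lost `flows` on VMotion would drop or wrongly re-admit established connections.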
VMotion-Aware Packet Interposition (diagram; VMworld 09 firewall demo by Check Point and IBM/ISS): per-VM filters on each ESX host
Networking I/O Virtualization
Goals:
  Allow multiple VMs running on a single host to share physical NICs
  Decouple vnics from physical adapters, enabling VM mobility
  Enhance reliability through NIC teaming
  Achieve the above while delivering high networking performance and minimizing CPU cost
VMware cares about performance: continually improving it through both evolutionary and revolutionary technology
Virtualized I/O Architecture
Layers: Guest OS -> Guest Device Driver -> Device Emulation -> I/O Stack -> Physical Device Driver -> Physical Device
Virtual device and driver: model a real device (e1000, vlance) or a virtualization-friendly device (vmxnet)
Virtualization I/O stack: virtual switch, NIC teaming, load balancing, failover, traffic isolation/shaping
Physical device and driver: Intel, Broadcom, ...; 10Gig and 1Gig; leverages physical device offloads, e.g. TSO
Performance Enhancing Techniques
Cost reduction:
  Copy avoidance: zero-copy TX
  Aggregated processing: TCP Segmentation Offload (aka Large Send Offload), Jumbo Frames, Large Receive Offload
Parallelization (the age of multicore processors):
  Software architecture and implementation that scales
  NetQueue: parallelism across multiple VMs
  vnic RSS: parallelism within a VM
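The point of aggregated processing is to pay the per-packet software cost once for a large chunk of data. A back-of-the-envelope sketch (illustrative arithmetic only, assuming MSS of roughly 1460 bytes for a 1500-byte MTU and 8960 bytes for a 9000-byte jumbo MTU) of how segment counts, and hence per-packet costs, shrink:

```python
import math

def segments(payload_bytes, mss):
    """Number of TCP segments the software path must touch for one send."""
    return math.ceil(payload_bytes / mss)

# 64 KB send over a standard 1500-byte MTU: software handles every segment.
per_packet = segments(65536, 1460)

# With TSO, the stack hands the NIC one 64 KB buffer and the hardware
# performs the segmentation, so the software path runs once per send.
with_tso = 1

# Jumbo frames reduce the segment count even without TSO.
with_jumbo = segments(65536, 8960)
```

The same logic applies in reverse on receive: LRO coalesces many wire packets into one large buffer before the stack sees them.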
Single VM 10GigE Performance (Std MTU)
Chart: single-VM transmit and receive throughput (Gbps, 0-9) without Jumbo Frames, across netperf configurations (socket size - message size, 8KB-512B through 64KB-64KB), for Linux and Windows guests
Benchmark: netperf TCP_STREAM microbenchmark, 5 sessions (i.e., 5 connections); legend: socket size - message size
Hardware: ESX server HP DL580 G5 (4x quad-core Xeon X7350 @ 2.93 GHz, 16 GB RAM, Intel Oplin 10 Gbps dual-port adapter); client HP DL380 G5 (2x quad-core Xeon X5355 @ 2.66 GHz, 16 GB RAM, Intel Oplin 10 Gbps dual-port adapter)
Software: ESX Server 3.5u1; client RHEL5u1 AS (kernel 2.6.18-53.el5, 64-bit); VMs: RHEL5u1 AS 64-bit and Windows Server 2003 SP2 Enterprise Edition 64-bit, 1 VCPU, 512 MB RAM, virtual device: Enhanced VMXNET
Single VM 10GigE Performance (9kB Jumbo Frames)
Chart: single-VM transmit and receive throughput (Gbps, 0-10) with Jumbo Frames, across netperf configurations (socket size - message size, 64KB-8KB through 128KB-64KB), for Linux and Windows guests
Benchmark: netperf TCP_STREAM microbenchmark, 5 sessions (i.e., 5 connections); legend: socket size - message size
Jumbo Frames benefit bulk-transfer, high-throughput applications
NetQueue
Receive queues are programmed with guest MAC addresses; each queue is assigned a unique MSI-X interrupt
1. The NIC classifies each incoming packet based on its destination MAC address (e.g., packets for guest OS 2 and 3) and DMAs it to the memory for that queue
2. The vmkernel delivers the packet to the virtual device
3. The virtual device posts a virtual interrupt to the guest OS
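In software terms, NetQueue is a MAC-based demultiplexer in front of per-VM receive queues, each with its own interrupt vector. A toy model (hypothetical names, illustrative only) of the classification step:

```python
class NetQueueNic:
    """Toy NetQueue model: the NIC classifies by destination MAC into per-VM RX queues."""

    def __init__(self):
        self.queues = {}   # guest MAC address -> list of packets (one queue per VM)

    def program_rx_queue(self, guest_mac):
        # The vmkernel programs a queue with the guest's MAC; in hardware each
        # queue would also get a unique MSI-X interrupt vector.
        self.queues[guest_mac] = []

    def receive(self, dst_mac, payload):
        # Step 1: classify on destination MAC and DMA into that queue's memory.
        queue = self.queues.get(dst_mac)
        if queue is not None:
            queue.append(payload)
            return True    # steps 2-3: vmkernel delivery + virtual interrupt
        return False       # no matching queue (real hardware has a default queue)
```

Because each VM's traffic lands in its own queue, receive processing for different VMs can run on different cores in parallel.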
Virtualized I/O Summary
Can achieve aggregate 10Gig line rate
Well-tested technology
Continual, evolutionary improvements
VMDirectPath I/O
Direct assignment of PCI devices to a VM: the guest directly controls the physical hardware
Requires an I/O MMU for DMA address translation and protection
Not limited to networking devices
Requires a generic device reset (FLR, link reset, ...)
Highest performance at the lowest CPU cost, but comes with some challenges
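The I/O MMU's job can be reduced to a guarded address map: DMA targets that the guest driver expresses as guest-physical addresses are translated to machine addresses, and any access outside the VM's mapping faults. A schematic sketch (toy model, not real IOMMU page-table code):

```python
class IoMmu:
    """Schematic I/O MMU: translates guest-physical pages to machine pages per device."""

    PAGE = 4096

    def __init__(self):
        self.maps = {}   # device id -> {guest page number: machine page number}

    def map_page(self, dev, guest_pn, machine_pn):
        self.maps.setdefault(dev, {})[guest_pn] = machine_pn

    def dma(self, dev, guest_addr):
        # Translate, or fault: a device can only reach memory mapped for it,
        # which is the protection property direct device assignment relies on.
        page, offset = divmod(guest_addr, self.PAGE)
        table = self.maps.get(dev, {})
        if page not in table:
            raise PermissionError("DMA fault: unmapped guest page")
        return table[page] * self.PAGE + offset
```

Without this translation layer, a directly assigned device could DMA anywhere in machine memory, breaking VM isolation.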
Device Sharing
SR-IOV: a standard for hardware partitioning
Physical Function (PF): controlled by the hypervisor (PF device driver plus device manager); handles device management
Virtual Functions (VFs): assigned to guests, each with its own guest device driver
Inter-VF switching: embedded switching or external switching
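The PF/VF split can be modeled as one controlling function that carves the device into independent virtual functions, each assignable to exactly one guest. A toy sketch (hypothetical names, not a real SR-IOV driver):

```python
class SriovDevice:
    """Toy SR-IOV model: one PF (hypervisor-owned) partitions into assignable VFs."""

    def __init__(self, max_vfs):
        self.max_vfs = max_vfs
        self.vfs = {}            # vf index -> assigned guest (None = unassigned)

    def enable_vfs(self, count):
        # The PF driver in the hypervisor enables hardware partitioning.
        if count > self.max_vfs:
            raise ValueError("device supports at most %d VFs" % self.max_vfs)
        self.vfs = {i: None for i in range(count)}

    def assign(self, vf_index, guest):
        # Each VF is handed to exactly one guest for direct I/O.
        if self.vfs.get(vf_index, "missing") is not None:
            raise ValueError("VF unavailable")
        self.vfs[vf_index] = guest
```

The hypervisor keeps the PF (management, inter-VF switching policy) while the guests drive their VFs directly, which is how one physical NIC is shared without the emulation path.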
Challenges with VMDirectPath I/O
Transparent VMotion: no simple way to checkpoint device state, etc.
VM management: IHV drivers for different vendors, device types, revisions, and guest OSes
Isolation/security: VMsafe API, promiscuous mode, source MAC address spoofing
VMware Fault Tolerance: the hypervisor is not involved in guest I/O
Memory over-commitment: no visibility into DMAs to guest memory
VMware is working with leading NIC vendors to solve these
VMDirectPath I/O Gen1
No VMotion support
Use case: appliance VMs or special-purpose VMs that do not need VMotion (e.g., an appliance sharing local disks, a firewall VM)
Benefits: high performance at the lowest CPU cost; VM use of devices without a virtual counterpart (e.g., high-end graphics, TOE)
Certain virtualization features are disabled: VMotion, suspend/resume, Fault Tolerance, ...
VMDirectPath I/O Gen2: Uniform Passthrough
VMotion support
Split the device interface into two parts:
  Passthrough for performance-critical operations: TX/RX producer index registers, interrupt mask register
  Emulation for infrequent operations: a management driver running in ESX
Uniform hardware and software device interface for the passthrough part
Dynamic switching between an emulated (virtualized I/O) vnic and passthrough; the guest is neither affected by nor aware of the mode
(Diagram: Guest Device Driver -> Device Emulation / vswitch -> pnic driver and management driver -> Physical Device)
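The uniform-passthrough idea is that the guest-visible vnic front end stays constant while ESX swaps the back end underneath it. A toy sketch (hypothetical names, illustrative only) of a mode switch the guest never observes:

```python
class UniformVnic:
    """Toy model: one guest-visible vnic, two interchangeable back ends."""

    def __init__(self):
        self.mode = "emulated"    # start on the virtualized-I/O path
        self.tx_count = 0         # guest-visible state, preserved across switches

    def send(self, frame):
        # The guest driver calls the same entry point in either mode.
        self.tx_count += 1
        return (self.mode, frame)

    def set_mode(self, mode):
        # ESX flips between "emulated" and "passthrough" (e.g., dropping back
        # to emulation around VMotion); guest-visible state carries over.
        assert mode in ("emulated", "passthrough")
        self.mode = mode
```

Because the interface and the guest-visible state are identical in both modes, ESX can fall back to emulation for VMotion and restore passthrough afterward without the guest noticing.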
VMDirectPath Gen2: Network Plug-in Architecture (VMware/Intel initiative)
Relaxes the need for exactly matching hardware designs while still passing through the performance-critical parts
Partitions the guest (vmxnet) driver into a guest-OS-specific shell and a hardware-specific module (the plug-in driver)
Shell: implements the interface to the OS network stack; interacts with the hypervisor for configuration
Plug-in driver: interacts with the hardware in the data path; uses the shell interface for all OS-specific calls
ESX controls which plug-in the shell uses: it dynamically loads the hardware-specific plug-in and maps the VF into the VM address space
(Diagram: guest vmxnet driver composed of VMware shell code pages and IHV plug-in code pages, with hardware pages remapped into the VM)
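The shell/plug-in split is ordinary interface-based layering: the OS-facing shell is fixed, and ESX binds a hardware-specific module behind it at runtime. A minimal sketch (hypothetical names, not Intel's or VMware's code):

```python
class PluginDriver:
    """Base interface for hardware-specific data-path modules (IHV code)."""

    def tx(self, frame):
        raise NotImplementedError

class VendorXPlugin(PluginDriver):
    """Hypothetical vendor plug-in: would poke the VF registers mapped into the VM."""

    def tx(self, frame):
        return "vendor-x-tx:%d" % len(frame)

class VmxnetShell:
    """Guest-OS-specific shell: owns the OS network interface, delegates the data path."""

    def __init__(self):
        self.plugin = None

    def load_plugin(self, plugin):
        # ESX decides which plug-in the shell uses for the underlying VF.
        self.plugin = plugin

    def send(self, frame):
        if self.plugin is None:
            raise RuntimeError("no plug-in loaded")
        return self.plugin.tx(frame)
```

The payoff is that one guest driver (the shell) works across vendors: moving a VM to a host with different hardware only requires ESX to load a different plug-in.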
vnetwork I/O Performance Conclusion
VMware has a long history of delivering feature-rich network I/O virtualization with continual performance enhancements
A pipeline of new technologies is coming via both evolutionary (virtualized I/O emulation) and revolutionary (VMDirectPath I/O) paths, in collaboration with major NIC vendors