Live Migration of Virtualized Edge Networks: Analytical Modeling and Performance Evaluation
Walter Cerroni, Franco Callegati - DEI, University of Bologna, Italy
Outline

Motivations
Virtualized edge networks
Live migration of virtual machines (VMs)
Multiple VM migration model
  sequential migration
  parallel migration
Numerical results
Conclusion
Motivations

Traditional IP network architecture is showing its limitations, especially at the network edge
  heterogeneity of today's application service requirements
  complexity of cross-layer network administration and management
  multiplicity of L4-L7 network functions executed by many closed, specialized middle-boxes
Emerging technologies foster a paradigm shift
  SDN, to open network devices, separate bare-metal hardware from network intelligence, and ease new service deployment
  NFV, to transform middle-boxes into software applications running on standard hardware and simplify network administration and management
Goal: bring the advantages of the cloud to edge networks
Reference Scenario

[Figure: fixed and mobile users reach the core network through a virtualized edge network, with data centers located at the edge]
Virtualized Edge Networks

[Figure: network functions (access router, firewall, NAT, switch, edge router) and servers (media server, web server) run as VMs on standard hardware, interconnected by SDN/NFV-controlled virtual bridges/switches on top of the hypervisor kernel]
Preliminary Test Setup

A. Manzalini et al., "Clouds of Virtual Machines in Edge Networks," IEEE Communications Magazine, July 2013.
Performance with Off-the-Shelf Technology

A. Manzalini et al., "Clouds of Virtual Machines in Edge Networks," IEEE Communications Magazine, July 2013.
Live Migration of Virtual Machines

Service virtualization is a widely used technique for data center administration and maintenance
Advantages of OS virtualization (virtual machines)
  quick deployment of new service instances
  effective load balancing and server consolidation
  easy service replication and migration (mobility)
  easy backup and restore procedures
Live migration of VMs
  the current state of the VM's kernel and running processes is maintained
  SDN can help maintain the network state as well
  no need to wait for long shut-down and restart phases
  no risk of inconsistencies due to duplicate running VM instances
  clients do not need to disconnect and reconnect
  DC providers and customers work on fully separated domains
Live Migration of Virtual Machines

Focus on memory migration
  storage migration (if needed) through NAS synchronization
  network state migration through SDN
Two approaches
  pre-copy: push most of the memory pages to the destination host before stopping the VM at the source host
  post-copy: pull most of the memory pages from the source host after resuming the VM at the destination host
We assume the pre-copy approach (adopted by Xen, KVM, VirtualBox, etc.)
  iterative push phase: memory pages modified in a given round are sent again in the next round, until the total size of dirty pages falls below a given threshold or a maximum number of iterations is reached
  stop-and-copy phase: the VM is suspended at the source host and the remaining dirty pages are copied to the destination
  resume phase: the VM is resumed at the destination with consistent memory and network state
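The three phases above can be sketched as a short simulation loop. This is a minimal illustration, not code from any real hypervisor: the function and parameter names are my own, sizes and rates are in bits and bits/s, and the dirtying rate is in pages/s.

```python
# Minimal sketch of the iterative pre-copy algorithm described above.
# Names and units are illustrative, not taken from any hypervisor API.

def pre_copy_migrate(memory_size, dirty_rate, page_size, bit_rate,
                     threshold, max_rounds):
    """Simulate the iterative push phase of pre-copy migration.

    Returns (rounds, remaining), where `remaining` is the dirty memory
    still to be sent while the VM is suspended (stop-and-copy phase).
    """
    to_send = memory_size  # round 0: push the entire memory
    rounds = 0
    while to_send > threshold and rounds < max_rounds:
        round_time = to_send / bit_rate                # time to push this round
        to_send = dirty_rate * page_size * round_time  # memory dirtied meanwhile
        rounds += 1
    return rounds, to_send
```

For example, a VM with 1 GB of memory, 4 KB pages, a dirtying rate of 1000 pages/s and a 1 Gb/s migration channel converges in a few rounds, since each round shrinks the dirty set by the factor D*P/R.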
Performance Metrics for VM Live Migration

Downtime: amount of time the VM is suspended
  measures the user's perceived quality
Total Migration Time: amount of time needed to copy the whole memory
  measures the impact of the migration process on both communication infrastructure and computing resource utilization
[Figure: timeline of copied and dirtied memory pages across the iterative push, stop-and-copy, and resume phases]
Multiple VM Live Migration Model

M: number of mutually dependent VMs in the set to be migrated
V_i: memory size of VM i
D_i: page dirtying rate of VM i
P_i: memory page size of VM i
R_i: bit rate used to transfer VM i
n_i: number of iterations needed to migrate VM i
V_{i,j}: amount of dirty memory of VM i to be copied in round j, 0 <= j <= n_i
T_{i,j}: duration of round j for VM i
Simplified Model

V_i = V: all VMs in the set have the same amount of memory
D_i = D: all VMs in the set show the same fixed page dirtying rate
P_i = P: all VMs in the set have the same memory page size
R_i: the bit rate dedicated to the migration of VM i is fixed
k_i = D P / R_i < 1: condition for the pre-copy algorithm to be sustainable
Each round transfers the memory dirtied during the previous one, so V_{i,j} = V k_i^j and T_{i,j} = (V / R_i) k_i^j
n_i = min( ceil( log_{k_i}(V_th / V) ), n_max ), with V_th the dirty memory size threshold and n_max the max no. of iterations
T_i = sum_{j=0..n_i} T_{i,j} = (V / R_i) (1 - k_i^{n_i+1}) / (1 - k_i): total migration time of VM i
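The geometric shrinkage of the dirty set admits closed forms for both metrics. A hedged sketch of the per-VM computation under the uniform assumptions above (variable names are mine; sizes in bits, rates in bits/s and pages/s; resume time neglected):

```python
import math

def migration_time(V, D, P, R, V_th, n_max):
    """Per-VM pre-copy metrics under the simplified model.

    Round j transfers V * k**j bits (k = D*P/R < 1), so the number of
    rounds is the smallest n with V * k**n <= V_th, capped at n_max.
    Returns (total_time, downtime), where downtime approximates the
    stop-and-copy round.
    """
    k = D * P / R
    assert k < 1, "pre-copy unsustainable: memory dirtied faster than sent"
    n = min(n_max, max(0, math.ceil(math.log(V_th / V, k))))
    total_time = (V / R) * (1 - k ** (n + 1)) / (1 - k)  # geometric sum of rounds
    downtime = (V / R) * k ** n                          # final round only
    return total_time, downtime
```

With V = 8e9 bits (1 GB), D = 1000 pages/s, P = 32768 bits (4 KB pages), R = 1 Gb/s and V_th = 1 MB, this gives a total migration time of about 8.27 s and a sub-millisecond downtime.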
Performance of Multiple VM Live Migration

Correlations among VMs require new definitions of the performance metrics for the whole set of VMs
Total Migration Time
  starts when the first VM begins the push phase
  ends when the last VM ends the stop-and-copy phase
Downtime
  starts when the first VM begins the stop-and-copy phase
  ends when the last VM ends the resume phase
Both depend on
  order of migration
  migration scheduling strategy
  amount of bandwidth used to perform VM migration
We analyze two simple strategies
  sequential migration
  parallel migration
Sequential vs. Parallel VM Migration

Sequential: migration of one VM at a time at full channel bit rate
Parallel: simultaneous migration of all VMs equally sharing the channel bit rate
A smaller transfer bit rate with the same dirtying rate leads to more iterations in parallel migration than in sequential migration
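Under the simplified model the two strategies can be compared numerically. The sketch below uses my own helper names and simplifying assumptions (back-to-back sequential migrations, fully synchronized parallel migrations of identical VMs, negligible resume time) to compute the set-level metrics defined above:

```python
import math

def precopy(V, D, P, R, V_th, n_max):
    """Per-VM total migration time and downtime (simplified model)."""
    k = D * P / R
    assert k < 1, "pre-copy unsustainable at this bit rate"
    n = min(n_max, max(0, math.ceil(math.log(V_th / V, k))))
    total = (V / R) * (1 - k ** (n + 1)) / (1 - k)
    down = (V / R) * k ** n
    return total, down

def sequential(M, V, D, P, R, V_th, n_max):
    t, d = precopy(V, D, P, R, V_th, n_max)  # each VM gets the full rate R
    # Set downtime spans from the first VM's stop-and-copy to the last
    # VM's resume, i.e. it covers the M-1 remaining migrations.
    return M * t, (M - 1) * t + d

def parallel(M, V, D, P, R, V_th, n_max):
    t, d = precopy(V, D, P, R / M, V_th, n_max)  # rate shared equally
    return t, d  # identical VMs migrate in lockstep, downtimes overlap
```

For M = 10 VMs of 1 GB on a 1 Gb/s channel, parallel migration takes longer overall (about 119 s vs. 83 s) but its overlapped stop-and-copy phases yield a set downtime of tens of milliseconds, against tens of seconds for sequential migration: exactly the trade-off between resource usage and perceived quality.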
Sequential vs. Parallel VM Migration: Trade-off
Results: Role of Page Dirtying Rate
Results: Number of Iterations
Results: Role of Memory Size
Results: Role of Dirty Memory Size Threshold
Results: Dimensioning the Channel Bit Rate
Results: Role of Number of VMs in the Set
Results: Role of Critical Subset Size

Critical subset: only m VMs out of the M in the set must be running to provide the service
Conclusion

Need for a paradigm shift, especially at edge networks
  bring the advantages of cloud infrastructures to edge networks
  SDN and NFV are the key enablers
Virtualized edge networks
  demonstrated with preliminary tests using VMs
  need to improve live migration performance
Multiple VM live migration model
  performance depends on migration schedule and resources
  sequential vs. parallel migration: trading off resource usage against the user's perceived quality
Further study on-going
  VMs with different memory sizes
  different bandwidth allocation strategies
  the trade-off holds in general
  memory transfer synchronization helps limit the downtime