
The Pennsylvania State University
The Graduate School

MEMORY ANALYSIS TOWARDS MORE EFFICIENT LIVE MIGRATION OF APACHE WEB SERVER

A Thesis in Computer Science and Engineering by Wenqi Cao. © 2015 Wenqi Cao. Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science, May 2015.

The thesis of Wenqi Cao was reviewed and approved by the following: Peng Liu, Graduate Faculty of Computer Science and Engineering and Professor of Information Science and Technology, Thesis Advisor; Guohong Cao, Professor of Computer Science and Engineering; Lee Coraor, Associate Professor of Computer Science and Engineering and Director of Academic Affairs of the Department of Computer Science and Engineering. Signatures are on file in the Graduate School.

Abstract

Virtual Machine live migration is a key technique in the cloud. It benefits workload balancing and data backup. During Virtual Machine live migration, memory is an important part of the state that needs to be transferred from the source to the destination. Current techniques can be divided into two categories: Pre-Copy and Post-Copy. Both approaches migrate all memory blocks from source to target without any memory analysis before live migration. However, some memory blocks do not need to be transferred during the live migration process, because their content always remains the same. Moreover, modern operating systems use otherwise unused memory to cache recently accessed blocks of the permanent storage device. This technique is called Prefetcher or SuperFetch. Although the Prefetcher can reduce the amount of time the system takes to start programs, it occupies memory space that might not be used by users.

These observations lead us to propose an approach that analyzes the Virtual Machine memory page table and divides the memory data into two parts: active memory and relatively stable memory. Instead of migrating all Virtual Machine memory through live migration, we live migrate only the active memory. The relatively stable memory can be merged into the Virtual Machine image file. Thus, during the startup process, the target machine can directly load these memory blocks into its physical memory from the image file stored in shared storage. The proposed approach can considerably reduce the amount of data sent between the two hosts involved in the live migration and has the potential to shorten the total migration time.

Table of Contents

List of Figures
List of Tables
Acknowledgments

Chapter 1 Introduction
  1.1 Virtualization and Live Migration
  1.2 More Efficient Live Migration
  1.3 Proposed Method

Chapter 2 Background
  2.1 Shadow Page Table
  2.2 Nested Paging
  2.3 Kernel-based Virtual Machine

Chapter 3 Related Work
  3.1 Pre-Copy Live Migration
  3.2 Post-Copy Live Migration
  3.3 Live Migration Based on Full System Trace and Replay
  3.4 Live Migration with Adaptive Memory Compression
  3.5 Delta Compression Techniques for Efficient Live Migration
  3.6 Summary

Chapter 4 Implementation
  4.1 Method Description
  4.2 Relationship of Live Migration Efficiency and Memory Reduction
  4.3 EPT Violation Interception
  4.4 Dirty Page Detection
  4.5 RSM Calculation

Chapter 5 Result
  5.1 Experiment Setup
  5.2 Boundary between Two Running Status
  5.3 RSM Result
  Conclusion and Future Work

Bibliography

List of Figures

1.1 Proposed Live Migration Approach
2.1 Shadow Page Tables for Memory Mapping
2.2 Memory Mapping of Guest VMs
2.3 NPT Infrastructure
2.4 32-bit MMU-Intensive Kernel Microbenchmark Results
2.5 64-bit MMU-Intensive Kernel Microbenchmark Results
2.6 KVM Memory Map
2.7 Guest Execution Loop
3.1 Pre-Copy Migration Timeline
3.2 Timeline for Pre-Copy vs. Post-Copy
3.3 Process of Trace and Replay Migration
3.4 Memory Page Characteristics Analysis
3.5 Source Side Delta Compression Scheme
3.6 Destination Side Delta Compression Scheme
4.1 Memory Changing During Two Phases
4.2 2-D Walk
4.3 The Process of EPT Violation Handling
4.4 For Each Shadow Entry
4.5 Shadow Walk Okay
4.6 Shadow Walk Next
4.7 MMU Set SPTE (1)
4.8 MMU Set SPTE (2)
4.9 KVM Memory Slot
4.10 Get Dirty Log (1)
4.11 Get Dirty Log (2)
4.12 RSM Calculation Code (1)
4.13 RSM Calculation Code (2)
5.1 Experimental Host Configuration

5.2 Pressure Test Program Result
5.3 Number of New EPT Records by Increasing Time
5.4 EPT Record Changing Result
5.5 MCR and IMR
5.6 Memory Changing Rate in 60 Minutes under 200M Pressure Test
5.7 Memory Changing Rate in 60 Minutes under 400M Pressure Test
5.8 Memory Changing Rate in 60 Minutes under 600M Pressure Test
5.9 Memory Changing Rate in 60 Minutes under 800M Pressure Test
5.10 Memory Changing Rate in 60 Minutes under 1000M Pressure Test

List of Tables

4.1 Types Related to Memory Changing
5.1 Experimental Result under 5 Different Pressure Tests

Acknowledgments

I would like to thank my advisor Dr. Peng Liu. He has provided significant guidance and insight throughout my time at The Pennsylvania State University. With his patience during this thesis research, I was able to pursue success without hesitation. I would also like to thank Dr. Guohong Cao and the other professors and colleagues who helped me during my research by providing much-needed advice. My opportunities to take part in their research groups were very rewarding. Above all, I want to extend my deepest gratitude to my wife and parents. Without their infinite patience, I would never be where I am today. They have supported me through very difficult times and are always encouraging me to pursue all of my endeavors.

Chapter 1
Introduction

1.1 Virtualization and Live Migration

Live migration of Virtual Machines (VMs) across distinct physical hosts is a very important feature of virtualization technology for maintenance, load balancing and energy reduction, especially for data center operators. Over the past few years, the availability of fast networks has led to a shift from running services on privately owned and managed hardware to co-locating those services in data centers. Along with the widespread availability of fast networks, the key technology that enables this shift is virtualization [1]. In a virtualized environment, the software no longer runs directly on bare-metal hardware but instead on virtualized hardware. The environment, such as the number of CPUs, the amount of RAM, the disk space and so forth, can be tailored to the customer's exact needs. From the perspective of the data center operator, virtualization provides the opportunity to co-locate several VMs on one physical server. This consolidation reduces the cost

for hardware, space and energy. An important feature of virtualization technology is live migration [2]. With live migration, it is possible to move a VM from one host to another without shutting it down, increasing flexibility in VM provisioning. Live migration techniques for VMs focus on capturing and transferring the run-time, in-memory state of a VM over a network. Live migration enables administrators to reconfigure virtual machine hardware and move running virtual machines from one host to another while maintaining near-continuous service availability. This ability to migrate applications encapsulated within VMs without perceivable downtime is becoming increasingly important in order to provide continuous operation and availability of services in the face of outages, whether planned or unplanned. Live migration also increases efficiency, as it is possible to manage varying demands in an organization with fewer physical servers and less human effort by enabling on-the-fly server consolidation. So far, live migration research has focused on transferring the run-time in-memory state of VMs with relatively limited memory size in Local Area Networks (LANs) [3]. The usability of current live migration techniques is limited when large VMs or VMs with high memory loads, such as VMs running large enterprise applications, have to be migrated, or when migration is performed over slow networks. The reason for this shortcoming is that the hypervisor is unable to transfer the VM memory at the same rate as it is dirtied by the VM. This limitation can easily neutralize the advantages of live migration.

1.2 More Efficient Live Migration

In this thesis, more efficient means a smaller amount of memory data shipped from the source virtual machine to the target virtual machine. To migrate a running VM across distinct physical hosts, its complete state has to be transferred from the source to the target host. The state of a VM includes the permanent storage (data on disks), the volatile storage (data in memory), the state of connected devices (network interface cards) and the internal state of the virtual CPUs. In most setups the permanent storage is provided through network-attached storage and thus does not need to be moved. The state of the virtual CPUs and the virtual devices comprises a few kilobytes of data and can easily be sent to the target host. The main difficulty in migrating live Virtual Machines with several gigabytes of main memory is thus moving the volatile storage efficiently from one host to the other. The problem of memory data migration is very challenging. However, through our observation, there is a considerable amount of relatively stable data in main memory. If we utilize this observation well, we can reduce the amount of data transferred during live migration. For the following three reasons, we decided to propose a memory analysis approach aimed at more efficient live migration of the Apache web server. Firstly, modern operating systems use unused memory to cache recently accessed blocks of the permanent storage device. This technique is usually called Prefetcher or SuperFetch; it is a component of the Memory Manager that can speed up the operating system boot process and shorten the amount of time it takes to start programs. It accomplishes this by caching files that are needed by an

application to RAM as the application is launched, thus consolidating disk reads and reducing disk seeks. Although the Prefetcher can reduce the amount of time the operating system takes to start programs, it occupies memory space that might not be used by users. Secondly, generally speaking, there are three kinds of web servers: static content servers, dynamic web application servers and video streaming servers. The dynamic web workload generates a large number of writes in bursts. The static web workload generates a medium, roughly constant number of writes. And the video streaming workload creates relatively few writes but is very latency sensitive [4]. These results mean that the memory changing rate and page dirty rate of dynamic web application servers and video streaming servers vary greatly and are very unpredictable, whereas the memory changing rate of a static content server behaves in the opposite way. Thirdly, current techniques for Virtual Machine live migration can be divided into two categories: pre-copy [2] and post-copy [5]. Both approaches migrate all memory blocks from source to target without memory analysis that could reduce the migrated memory size. However, some memory blocks do not need to be transferred during the live migration process, because they always remain the same.

1.3 Proposed Method

Before offering a concrete statement of the problems we address through our work, I describe our method at a high level. The live migration process can be considered

as three phases: Push Phase: The source Virtual Machine continues running while certain pages are pushed across the network to the new destination. To ensure consistency, pages modified during this process must be re-sent. Stop-and-Copy Phase: The source Virtual Machine is stopped, pages are copied across to the destination Virtual Machine, then the new Virtual Machine is started. Pull Phase: The new Virtual Machine executes and, if it accesses a page that has not yet been copied, this page is faulted in across the network from the source Virtual Machine [5]. Based on these three phases, our goal in this thesis is to reduce the amount of memory data that is shipped by analyzing the Virtual Machine memory page table and dividing the memory data into two parts (active memory and relatively stable memory). Instead of migrating all Virtual Machine memory through live migration, we live migrate only the active memory. The relatively stable memory can be merged into the Virtual Machine image file before live migration. Thus, during the startup process, the target machine can directly load the relatively stable memory blocks into its physical memory from the image file stored in the shared storage. The proposed approach can considerably reduce the amount of data sent between the two hosts involved in the live migration and has the potential to shorten the total migration time. Our proposed approach measures the memory changing rate before live migration in order to separate active memory blocks from relatively stable memory blocks. We monitor Apache static web servers and statistically intercept and analyze the write

operations and memory allocation operations inside memory. We define the memory changing rate based on the sequence of write operations and rely on it heavily throughout this work. In contrast, the existing techniques only monitor page faults during live migration for dirty page re-transmission; they do not measure or use the memory changing rate. The high-level proposed live migration process is shown in Figure 1.1. The detailed design, implementation and experimental tests are addressed in Chapters 4 and 5.

Figure 1.1. Proposed Live Migration Approach.

The rest of this thesis is organized as follows. Chapter 2 introduces the background and some key techniques of this thesis. Chapter 3 discusses work related to this study. Chapter 4 describes the methodology and implementation of this work. Chapter 5 presents and discusses the experimental setup and results. I end the study with concluding remarks and a discussion of possible future work in the last section of Chapter 5.

Chapter 2
Background

In this chapter, we introduce the basic components involved in Virtual Machine live migration and discuss their features. The material is categorized into: shadow page tables, EPT/NPT, and finally a brief overview of the KVM VMM.

2.1 Shadow Page Table

The operating system in each guest Virtual Machine maintains its own page tables. These tables reflect the virtual-to-real memory mapping that the guest OS manages. As opposed to this virtual-to-real mapping, the virtual-to-physical mapping is kept by the VMM in shadow page tables [6, 7], one for each of the guest Virtual Machines. Figure 2.1 illustrates the shadow page tables for the example of Figure 2.2. These tables are the ones actually used by the hardware to translate virtual addresses and to keep the TLB up to date. The entries in these shadow page tables essentially eliminate one level of indirection in the virtual-to-real-to-physical mapping.

Figure 2.1. Shadow Page Tables for Memory Mapping.

To make this method work, the page table pointer register is virtualized. The VMM manages the real page table pointer and has access to the virtual version of the register associated with each guest Virtual Machine. When the VMM activates a guest Virtual Machine, it updates the page table pointer so that it indicates the correct shadow version of the guest's current page table. If a guest attempts to access the page table pointer, either to read it or to write it, the read or write instruction traps to the VMM. The trap occurs either automatically, because these instructions are privileged, or because code patching has replaced them with a trap.

Figure 2.2. Memory Mapping of Guest VMs.

If the guest's attempt to access the page table pointer is a read attempt, the VMM returns the guest's virtual page table pointer; whereas if it is a write attempt, the VMM updates the virtual version and then updates the real page table pointer to point to the corresponding shadow table. The true mapping of virtual to physical pages may differ from the virtual-to-real view that the guest operating systems have, and page fault handling must take this into account. First, note that the VMM should not have a virtual-to-physical page mapping in a shadow table if the guest OS does not have the same virtual page mapped to real memory in its corresponding virtual table. Otherwise, an access that should page fault from the guest's perspective would not cause a page fault in the Virtual Machine environment, thereby breaking the equivalence property.

Therefore, when a page fault does occur, the page may or may not be mapped in the virtual table of the guest OS. If it is mapped, this page fault should be handled entirely by the VMM. This is a case where the VMM has moved the accessed real page to its own swap space. Consequently, the VMM brings the real page back into physical memory and then updates the real map table and the affected shadow table appropriately to reflect the new mapping. The guest OS is not informed of the page fault, because such a page fault would not have occurred if the guest OS were running natively. On the other hand, if the page is not mapped in the guest, the VMM transfers control to the trap handler of the guest, indicating a page fault. The guest OS then issues I/O requests to effect a page-in operation (possibly with a swap-out of a dirty page). The guest OS then issues instructions to modify its page table. These requests are intercepted by the VMM, either because they are privileged instructions or because the VMM write-protects the area of memory holding the page table of the guest. At that point the VMM updates the page table and also updates the mapping in the appropriate shadow page table before returning control back to the guest virtual machine. The real map table contains a mapping of the real pages of each virtual machine to the physical pages of the system. When performing I/O with real addresses, the VMM converts the real addresses presented by a virtual machine to physical addresses using the real map table.

Input/output address mapping turns out to be somewhat tricky, because contiguous real pages may not be contiguous in physical memory. Thus the VMM may need to convert an I/O request that spans multiple pages into multiple I/O requests, each referring to a contiguous block of physical memory [8].

2.2 Nested Paging

Nested paging is a hardware solution for alleviating the software memory-management overhead imposed by system virtualization. Nested paging complements existing page walk hardware to form a two-dimensional (2D) page walk, which reduces the need for hypervisor intervention in guest page table management. However, the extra dimension also increases the maximum number of architecturally required page table references [9, 10, 11]. To avoid the software overhead of shadow paging, many hardware mechanisms have been proposed to avoid the intervention of the hypervisor in memory management. One such technique is nested paging, in which the guest page table converts guest virtual addresses to guest physical addresses, while a new table, the nested page table, is introduced to map guest physical addresses to system physical addresses. The guest remains in control of its own page table without hypervisor intercepts. Paging control bits and CR3 are duplicated to allow the nested page table base and mode to be independent from the guest. When an address translation is required, the 2D page walk hardware traverses the guest page table to map guest virtual addresses to guest physical addresses, with each guest physical address requiring a nested page table walk to obtain the system physical address.

Figure 2.3. NPT Infrastructure.

As shown in Figure 2.3, a CPU with hardware support for nested paging caches both the guest virtual memory to guest physical memory translation and the guest physical memory to real physical memory translation in the TLB. The TLB has a new Virtual Machine specific tag, called the Address Space Identifier (ASID). This allows the TLB to keep track of which TLB entry belongs to which Virtual Machine. The result is that a Virtual Machine switch does not flush the TLB; the TLB entries of the different virtual machines all coexist peacefully in the TLB. This makes the VMM a lot simpler and eliminates the need to update the shadow page tables constantly. If we consider that the hypervisor has to intervene for each update of the shadow page tables, it is clear that nested paging can seriously improve performance. Nested paging is especially important if you have more than one virtual CPU per Virtual Machine. Multiple CPUs have to synchronize the page tables often, and as a result the shadow page tables have to be updated a lot more too.

The performance penalty of shadow page tables gets worse as you use more CPUs per Virtual Machine. With nested paging, the CPUs simply synchronize TLBs as they would have done in a non-virtualized environment.

Figure 2.4. 32-bit MMU-Intensive Kernel Microbenchmark Results.

EPT/NPT-enabled CPUs offload a significant part of the VMM's MMU virtualization responsibilities to the hardware, resulting in higher performance. Results of experiments done on this platform indicate that the current VMM leverages these features quite well, resulting in performance gains of up to 48% for MMU-intensive benchmarks and up to 600% for MMU-intensive microbenchmarks [12].

Figure 2.5. 64-bit MMU-Intensive Kernel Microbenchmark Results.

2.3 Kernel-based Virtual Machine

The Kernel-based Virtual Machine, or KVM, is a Linux subsystem which leverages virtualization extensions to add a virtual machine monitor (or hypervisor) capability to Linux. Using KVM, one can create and run multiple virtual machines. These virtual machines appear as normal Linux processes and integrate seamlessly with the rest of the system [13]. Under KVM, virtual machines are created by opening a device node (/dev/kvm). A guest has its own memory, separate from the userspace process that created it. A virtual CPU is not scheduled on its own, however. KVM is structured as a fairly typical Linux character device. It exposes a /dev/kvm device node which can be used by userspace to create and run virtual machines through a set of ioctl()s. The operations provided by /dev/kvm include:

i) Creation of a new virtual machine. ii) Allocation of memory to a virtual machine. iii) Reading and writing virtual CPU registers. iv) Injecting an interrupt into a virtual CPU. v) Running a virtual CPU. (A minimal user-space sketch of this ioctl sequence is given later in this section.) Figure 2.6 shows how guest memory is arranged. Like user memory in Linux, the kernel allocates discontinuous pages to form the guest address space. In addition, userspace can mmap() guest memory to obtain direct access. This is useful for emulating DMA-capable devices. Running a virtual CPU deserves some further elaboration. In effect, a new execution mode, guest mode, is added to Linux, joining the existing kernel mode and user mode. Guest execution is performed in a triply-nested loop: i) At the outermost level, userspace calls the kernel to execute guest code until it encounters an I/O instruction, or until an external event such as the arrival of a network packet or a timeout occurs. External events are represented by signals. ii) At the kernel level, the kernel causes the hardware to enter guest mode. If the processor exits guest mode due to an event such as an external interrupt or a shadow page table fault, the kernel performs the necessary handling and resumes guest execution. If the exit reason is due to an I/O instruction or a signal queued to the process, then the kernel exits to userspace. iii) At the hardware level, the processor executes guest code until it encounters an instruction that needs assistance, a fault, or an external interrupt. Refer to Figure 2.7 for a flowchart representation of the guest execution loop.

Figure 2.6. KVM Memory Map.

As with all modern processors, x86 provides a virtual memory system which translates user-visible virtual addresses to the physical addresses that are used to access the bus. This translation is performed by the memory management unit, or MMU. The MMU consists of: i) A radix tree, the page table, encoding the virtual-to-physical translation. This tree is provided by system software in physical memory, but is rooted in a hardware register (the CR3 register).

ii) A mechanism to notify system software of missing translations (page faults). iii) An on-chip cache (the translation lookaside buffer, or TLB) that accelerates lookups of the page table. iv) Instructions for switching the translation root in order to provide independent address spaces. v) Instructions for managing the TLB.
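Below is a minimal user-space sketch of the /dev/kvm ioctl sequence described above. It is illustrative only: error handling, vCPU register setup and the guest code image are omitted, and the 2 MB memory size and the handled exit reasons are arbitrary choices, so a VM launched exactly this way would exit almost immediately.

/* Minimal sketch of the /dev/kvm ioctl sequence (not a working VM:
 * registers and guest code are never set up). */
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

int main(void)
{
    int kvm = open("/dev/kvm", O_RDWR);
    int vm  = ioctl(kvm, KVM_CREATE_VM, 0);           /* i) create a new VM */

    /* ii) allocate guest memory and register it as memory slot 0 */
    size_t mem_size = 0x200000;                        /* 2 MB guest RAM */
    void *mem = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    struct kvm_userspace_memory_region region = {
        .slot            = 0,
        .flags           = KVM_MEM_LOG_DIRTY_PAGES,    /* enable dirty tracking */
        .guest_phys_addr = 0,
        .memory_size     = mem_size,
        .userspace_addr  = (uintptr_t)mem,
    };
    ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region);

    /* v) create and run a virtual CPU (register setup omitted) */
    int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
    int run_size = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, 0);
    struct kvm_run *run = mmap(NULL, run_size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, vcpu, 0);

    for (;;) {                      /* outermost level of the triply-nested loop */
        ioctl(vcpu, KVM_RUN, 0);
        if (run->exit_reason == KVM_EXIT_IO)
            printf("guest performed an I/O instruction\n");
        else
            break;                  /* HLT, errors, etc.: stop the sketch */
    }
    return 0;
}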

Figure 2.7. Guest Execution Loop.

Chapter 3
Related Work

3.1 Pre-Copy Live Migration

Live migration is an active research area and a number of techniques have been proposed to migrate a running Virtual Machine from one host to another. The predominant approach for live migration is pre-copy. The bare-metal hypervisors VMware [14], KVM [13] and Xen [15], plus hosted hypervisors such as VirtualBox [16], employ a pre-copy approach. To reduce the downtime of the Virtual Machine, the state of the Virtual Machine is copied in several iterations [4]. While transferring the state of the last iteration, the Virtual Machine continues to run on the source machine. Pages that are modified during this transfer are recorded and need to be re-transmitted in the following iterations to ensure consistency. The iterative push phase is followed by a very short stop-and-copy phase during which the remaining modified memory pages as well as the state of the virtual CPUs and the devices are transferred to the target host. The pre-copy approach achieves a very short downtime in the best case, but for memory-write-intensive workloads the stop-and-copy phase may increase to several seconds.

Remote Direct Memory Access on top of modern high-speed interconnects can significantly reduce memory replication during migration.

Figure 3.1. Pre-Copy Migration Timeline.

There are six stages in pre-copy migration. Stage-0 Pre-Migration: We begin with an active Virtual Machine on physical host A. To speed any future migration, a target host may be preselected where the resources required to receive the migration will be guaranteed.

Stage-1 Reservation: A request is issued to migrate an OS from host A to host B. We initially confirm that the necessary resources are available on B and reserve a Virtual Machine container of that size. Failure to secure resources here means that the Virtual Machine simply continues to run on A unaffected. Stage-2 Iterative Pre-Copy: During the first iteration, all pages are transferred from A to B. Subsequent iterations copy only those pages dirtied during the previous transfer phase. Stage-3 Stop-and-Copy: We suspend the running OS instance at A and redirect its network traffic to B. CPU state and any remaining inconsistent memory pages are then transferred. At the end of this stage there is a consistent suspended copy of the Virtual Machine at both A and B. The copy at A is still considered to be primary and is resumed in case of failure. Stage-4 Commitment: Host B indicates to A that it has successfully received a consistent OS image. Host A acknowledges this message as commitment of the migration transaction: host A may now discard the original Virtual Machine, and host B becomes the primary host. Stage-5 Activation: The migrated Virtual Machine on B is now activated. Post-migration code runs to reattach device drivers to the new machine and advertise moved IP addresses. A detailed analysis of a migration of a SPECweb99 workload caught in the middle of its execution can be found in [4]. The test virtual machine was configured with 800MB of memory. The x-axis shows the time elapsed since the start of migration, while the y-axis shows the network bandwidth used to transfer pages to the destination. Because no memory compression or related techniques are used, the total size of the data transmitted is 960MB.

3.2 Post-Copy Live Migration

Post-copy techniques take the opposite approach. Post-copy first transmits all processor state to the target, starts the Virtual Machine at the target, and then actively pushes the Virtual Machine's memory pages from source to target [5]. Concurrently, any memory pages that are faulted on by the Virtual Machine at the target, and not yet pushed, are demand-paged over the network from the source. Post-copy thus ensures that each memory page is transferred at most once, avoiding the duplicate transmission overhead of pre-copy. The effectiveness of post-copy depends on the ability to minimize the number of network-bound page faults by pushing pages from the source before they are faulted upon by the Virtual Machine at the target. This approach achieves a very short downtime but incurs a rather large performance penalty due to the high number of lengthy page faults on the target machine. Several metrics characterize the post-copy strategy. Preparation Time: This is the time between initiating migration and transferring the Virtual Machine's processor state to the target node, during which the Virtual Machine continues to execute and dirty its memory. For pre-copy, this time includes the entire iterative memory copying phase, whereas it is negligible for post-copy. Downtime: This is the time during which the migrating Virtual Machine's execution is stopped. At the minimum this includes the transfer of processor state. For pre-copy, this transfer also includes any remaining dirty pages.

For post-copy, this includes other minimal execution state, if any, needed by the Virtual Machine to start at the target. Resume Time: This is the time between resuming the Virtual Machine's execution at the target and the end of migration altogether, at which point all dependencies on the source must be eliminated. For pre-copy, one needs only to re-schedule the target Virtual Machine and destroy the source copy. On the other hand, the majority of the post-copy approach operates in this period. Pages Transferred: This is the total count of memory pages transferred, including duplicates, across all of the above time periods. Pre-copy transfers most of its pages during preparation time, whereas post-copy transfers most during resume time. Total Migration Time: This is the sum of all the above times from start to finish. Total time is important because it affects the release of resources on both participating nodes as well as within the Virtual Machines on both nodes. Until the completion of migration, we cannot free the source Virtual Machine's memory. Application Degradation: This is the extent to which migration slows down the applications executing within the Virtual Machine. Pre-copy must track dirtied pages by trapping write accesses to each page, which significantly slows down write-intensive workloads. Similarly, post-copy requires the servicing of major network faults generated at the target, which also slows down Virtual Machine workloads. Figure 3.2 provides a high-level contrast of how the different stages of pre-copy and post-copy relate to each other. Although post-copy does not need to stop the Virtual Machine, it suffers from frequent page faults during data transfer. In fact, post-copy transfers almost the same volume of data from source to destination.

Figure 3.2. Timeline for Pre-Copy vs. Post-Copy.

3.3 Live Migration Based on Full System Trace and Replay

Unlike memory pre-copy algorithms, this method employs the target host's computation capability to synchronize the migrated Virtual Machine's state [17]. What is copied is the execution log of the source Virtual Machine rather than the dirty memory pages, and this may greatly decrease the amount of data transferred while synchronizing the two Virtual Machines' running states. This approach reduces the downtime by combining a bounded iterative log transferring phase with a typically short stop-and-copy phase. By iterative they mean that synchronization occurs in rounds, in which the log files to be transferred during round n are those generated during round n-1. After several rounds of iteration, the last log file transferred in the stop-and-copy phase is reduced to a negligible size, so that the downtime can be decreased to an imperceptible degree.

Figure 3.3. Process of Trace and Replay Migration.

Figure 3.3 shows the whole process of migrating a running Virtual Machine from host A to host B. The authors view the migration process as a transactional interaction between the two hosts involving the following phases. Stage-0 Initialization: a target host with sufficient resources is selected to guarantee the requirements of receiving the migration.

A good choice may speed the upcoming migration and boost the server's QoS. Stage-1 Reservation: host A makes a request to migrate a VM to host B. A VM container of the source VM's size should be reserved to guarantee that the necessary resources are available on host B. Stage-2 Checkpointing: the VM on top of host A freezes, and the system state at the current instant is saved to an image file in a copy-on-write fashion. After checkpointing, the source VM continues to run as though nothing had happened. Stage-3 Iterative Log Transferring: during the first round of transferring, the checkpoint file is copied from host A to B, while the VM on host A keeps running and non-deterministic system events are recorded in a log file. Subsequent iterations copy the log file generated during the previous transfer round. At the same time, host B replays the received log files once it has recovered from the checkpoint. As the log is transferred much faster than it is generated, this iterative process converges. Stage-4 Waiting-and-Chasing: after several rounds of iteration, when the log file generated during the previous transfer round is reduced to a specified size, host A asks B whether the stop-and-copy phase can be executed soon. If the resumed VM on host B does not replay fast enough and the cumulative unconsumed log on host B is still larger than the threshold at this time, host B should inform host A to postpone the stop-and-copy phase until the log is used up on host B. The iterative log transferring should continue until the size of the unconsumed log at host B is reduced to the threshold. As the log replay speed on host B is faster than the log generation speed on host A, the migrating VM on host B will eventually catch up with the running state of the source VM.

Stage-5 Stop-and-Copy: the source VM is suspended and the remaining log file is transferred to host B. After the last log file is replayed, there is a consistent replica of the VM at both A and B. Stage-6 Commitment: host B informs A that it has successfully synchronized their running states. Host A acknowledges this message as commitment of the migration transaction, and then all its network traffic is redirected to host B. However, CR/TR-Motion is valid only when the log replay rate is larger than the log growth rate. This inequality requirement between source and target nodes limits the application scope of live VM migration in clusters.

3.4 Live Migration with Adaptive Memory Compression

Live migration with adaptive memory compression is a memory-compression-based VM migration approach that uses memory compression to provide fast, stable virtual machine migration while guaranteeing that the virtual machine's services are only slightly affected [18]. Based on memory page characteristics, the authors design an adaptive zero-aware compression algorithm for balancing the performance and the cost of virtual machine migration. Pages are quickly compressed in batches on the source and exactly recovered on the target. Experiments demonstrate that, compared with Xen, the system can on average reduce downtime by 27.1%, total migration time by 32% and total transferred data by 68.8%.

This approach shows that most memory pages polarize into two completely opposite statuses: high word-similarity and low word-similarity. The authors then extend this test to VMs with other representative workloads. As Figure 3.4 shows, a majority of memory pages have more than 75% similarity or less than 25% similarity. For high word-similarity pages, they can exploit a simple but very fast compression algorithm based on strong data regularities, which achieves a win-win effect. In addition, they observe that memory data contain a large proportion of zeros. For pages containing large numbers of zero bytes, they design a simple but very fast algorithm to achieve high compression.

Figure 3.4. Memory Page Characteristics Analysis.

3.5 Delta Compression Techniques for Efficient Live Migration

The delta compression live migration is implemented as a modification to the KVM hypervisor. In this approach, the authors use an XOR binary RLE (XBRLE) live migration algorithm in order to increase migration throughput and thus reduce downtime [3]. When transferring a page, the source checks whether a previous version of the page exists in the cache. If so, a delta page between the new version and the cached version is created using XOR.

The delta page is compressed using XBRLE and a delta compression flag is set in the page header. Finally, the cache is updated and the compressed page is transferred. On the destination side, if the delta compression flag is set for a page, the delta page is decompressed and the new version of the page is reconstructed from the delta page and the destination's previous version of the page using XOR. A schematic overview of the delta compression scheme on the source and destination sides can be found in Figure 3.5 and Figure 3.6, respectively. The algorithm is implemented as a modification to the user-space code of the qemu-kvm hypervisor.

3.6 Summary

From the discussion in the previous sections, we can see that the pre-copy and post-copy approaches do not adopt any data compression strategy; they simply migrate whole memory blocks from source to destination without any data checking. The approaches in Sections 3.4 and 3.5 showed that reducing the transferred data can significantly reduce downtime and total migration time, and thus leads to a more efficient live migration. Although these approaches use various compression methods to decrease the transferred data size, they focus on the whole memory without analyzing the memory changing rate. However, if we measure the memory changing rate and identify the active memory and the relatively stable memory, we can transfer only the frequently changed memory blocks through live migration. The relatively stable memory can be merged into the virtual machine image file through the SAN network, which has far better bandwidth and latency than the normal network.

Figure 3.5. Source Side Delta Compression Scheme.

The target machine can directly fetch these memory data back during its startup process. In this way, the size of the memory blocks migrated during live migration is directly reduced. To compensate for the limitations of existing live migration methods, we propose a method that separates the memory blocks into active memory and relatively stable memory. Active memory means memory blocks that are changed frequently.

Figure 3.6. Destination Side Delta Compression Scheme.

Relatively stable memory means memory blocks that have not changed for a long time. To accomplish this purpose, we transparently intercept I/O requests and maintain an up-to-date record of changes to the memory page table. Through these records, we can categorize memory blocks by their changing rate. Even if relatively stable memory that has been shipped into the virtual machine image file is later dirtied by some operation, this problem can easily be solved by the traditional post-copy approach through a network page fault.

Chapter 4
Implementation

In this chapter, we give a detailed description of the proposed method. Next we describe the implementation of the EPT violation interception that logs new EPT records and re-mapping records. We then describe the implementation of the dirty page detection mechanism that logs dirty memory blocks in the cache. Finally, we describe the memory analysis method that generates the RSM and the Inactive Memory Rate (IMR).

4.1 Method Description

In this thesis, the goal of our proposed method is to reduce the amount of memory data shipped by analyzing the Virtual Machine memory page table and dividing the memory data into two general parts (active memory and relatively stable memory). Instead of migrating all Virtual Machine memory data through live migration, we migrate only the active memory data. The relatively stable memory data can be merged into the Virtual Machine image file before live migration.

Thus, during the startup process, the target machine can directly load the relatively stable memory data into its physical memory from the image file stored in shared storage. This can considerably reduce the amount of data sent between the two hosts involved in the live migration and has the potential to shorten the total migration time. The whole process of this method is shown in Figure 1.1. We analyzed the shipped memory data and found that it can be classified into three types: new allocation records, re-mapping records and dirty pages. i) New EPT Records: When the EPT cannot find a mapping between a Guest Physical Address (GPA) and a Host Physical Address (HPA), it generates an exception. This exception is handled by the MMU, which creates a new mapping record in the EPT. ii) Re-Mapping Records: When the mapping between a GPA and an HPA changes, the EPT also generates an exception. The MMU handles this exception and updates the related EPT records. iii) Dirty Pages: Any modified pages residing in the buffer cache that have not yet been flushed to memory and disk; in other words, any uncommitted data residing in the buffer cache. Because of copy-on-write in virtual memory operating systems, the amount of physical memory allocated for a process does not increase until data is written; this is typically done only for larger allocations. For better memory analysis, we also divide the virtual machine running status into two phases: an initialization phase and a stable running phase.

i) Initialization Phase: In this phase, the virtual machine is starting up. It initializes every module, service and process. The memory changing rate at this time is extremely high, so calculating the memory changing rate during this phase is less useful for our research. ii) Stable Running Phase: In this phase, all processes have already started up and keep running. There are no more initialization tasks, so we can regard the memory changes of the Apache Web Server as reactions to client access requests. In Chapter 5, I give a detailed explanation and experimental results showing how we divide the running status into these two phases.

Figure 4.1. Memory Changing During Two Phases.

We define the Relatively Stable Memory (RSM) and the Inactive Memory Rate (IMR) by a set of subtypes, which can be obtained by the following formulas using the subtypes in Table 4.1:

Relatively Stable Memory (RSM) = A - B - C    (1)

Inactive Memory Rate (IMR) = RSM / (A + D)

These formulas mean that we first obtain the total number of unmodified and clean pages among the EPT records as A - B - C, which we call the RSM. Secondly, A + D is the total number of pages in the EPT. Finally, dividing the first result by the second, we get the IMR. The reason we call it relatively stable memory is that the RSM might still change in the future. However, even if the RSM changes, we can use a hybrid live migration approach to solve this problem. The hybrid approach was first described for process migration, but we can use it in our live migration process. It works by doing a single pre-copy round in the preparation phase of migration. During this time, the Virtual Machine continues running at the source while all its memory pages are copied to the target host. After just one iteration, the Virtual Machine is suspended and all active memory is copied to the target. Subsequently, the Virtual Machine is resumed at the target and post-copy via pre-paging kicks in.

Table 4.1. Types Related to Memory Changing.
                       New EPT Records   Re-Mapping Records   Dirty Pages
Initialization Phase   A                 B                    C
Stable Running Phase   D                 E                    F
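To make the formulas concrete, here is a tiny worked example of equation (1) and the IMR. The counts are purely illustrative assumptions, not measurements from this thesis.

/* Worked example of equation (1); the counts are illustrative only. */
#include <stdio.h>

int main(void)
{
    long A = 2500;   /* new EPT records, initialization phase    */
    long B = 900;    /* re-mapping records, initialization phase */
    long C = 1200;   /* dirty pages, initialization phase        */
    long D = 800;    /* new EPT records, stable running phase    */

    long rsm = A - B - C;                 /* equation (1): RSM = A - B - C */
    double imr = (double)rsm / (A + D);   /* IMR = RSM / (A + D)           */

    printf("RSM = %ld pages, IMR = %.2f%%\n", rsm, imr * 100.0);
    /* prints: RSM = 400 pages, IMR = 12.12% */
    return 0;
}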

The goal of post-copy via pre-paging is to anticipate the occurrence of major faults in advance and adapt the page pushing sequence to better reflect the Virtual Machine's memory access pattern. While it is impossible to predict the Virtual Machine's exact faulting behavior, this approach works by using the faulting addresses as hints to estimate the spatial locality of the Virtual Machine's memory access pattern. The pre-paging component then shifts the transmission window of the pages to be pushed such that the current page fault location falls within the window. This increases the probability that pushed pages will be the ones accessed by the Virtual Machine in the near future, reducing the number of major faults. However, the main challenge is to build Table 4.1. Once the RSM is defined, we can directly classify the memory data and merge the RSM into the virtual machine's image file. We also use this knowledge during live migration: when some memory data needs to be shipped, if it is part of the RSM, we do not ship it.

4.2 Relationship of Live Migration Efficiency and Memory Reduction

In this section, we discuss the relationship between live migration efficiency and memory reduction. There are two main approaches, pre-copy and post-copy. I will compare each of them with our proposed approach to show that memory changing rate analysis is an independent method that can bring potential benefit, no matter which migration approach a system is using.

First, some notation is defined as follows:

R_page: the average growth rate of dirty pages in the migrated virtual machine.
R_tran: the average page transfer rate, which denotes the available network bandwidth for transferring pages from the source node to the target.
R_ana: the average memory analysis rate.
γ_ana: the average reduction ratio, i.e., the percentage of freed data.
V_tms: the total memory size of the virtual machine.

Pre-Copy. We compare the standard pre-copy approach and pre-copy with memory reduction through memory changing rate analysis. For pre-copy with memory reduction, the elapsed times of all rounds of live migration are represented by the vector T = <t_0, t_1, ..., t_{n-1}>. For standard pre-copy, the corresponding elapsed times are defined as the vector T' = <t'_0, t'_1, ..., t'_{n-1}>. Let α = 1 - γ_ana. Then

t_0 = α V_tms / R_tran,        t'_0 = V_tms / R_tran
t_1 = R_page t_0 / R_tran,     t'_1 = R_page t'_0 / R_tran

...

t_n = R_page t_{n-1} / R_tran,     t'_n = R_page t'_{n-1} / R_tran    (2)

Let λ = R_page / R_tran. There are two cases: i) λ < 1, which means that the dirty page rate is less than the network transfer rate. In this case, the live migration process terminates when the amount of dirty page data in some iteration is less than or equal to the threshold V_thd. From equation set (2), we get the following equations:

t_n = α V_tms R_page^n / R_tran^(n+1)    (3)

t'_m = V_tms R_page^m / R_tran^(m+1)    (4)

If pre-copy with memory reduction runs n rounds to reach the threshold and standard pre-copy runs m rounds, we get the equation t_n = t'_m. From equations (3) and (4), we can deduce:

α R_page^n / R_tran^(n+1) = R_page^m / R_tran^(m+1)    (5)

Equation (5) is equivalent to λ^(m-n) = α. Since α is greater than 0 and less than 1, and λ < 1, we can deduce n < m.

This inequality implies that, when the dirty page rate is less than the available network bandwidth for migration, pre-copy with memory reduction converges to the termination point of the live migration iterations in fewer rounds than the standard pre-copy approach. ii) λ > 1. In this case, the live migration process is terminated as soon as it reaches the maximum transfer time. Because of the reduced total amount of transferred data, the downtime is also shortened. Post-Copy. Post-copy ensures that each memory page is transferred at most once, whether through an active push or a network page fault, thus avoiding the duplicate overhead of pre-copy. The elapsed time of data transfer is T_1 for the standard post-copy approach and T_2 for post-copy with memory reduction:

T_1 = V_tms / R_tran,        T_2 = α V_tms / R_tran

All memory blocks (V_tms) need to be migrated through live network copy in the standard post-copy approach, while post-copy with memory reduction migrates only the active memory (α V_tms) through live migration. Thus, the latter approach reduces the resume time, which is the time between resuming the Virtual Machine's execution at the target and the end of migration altogether. Considering that the preparation time and the downtime stay the same in both approaches, post-copy with memory reduction will also decrease the total migration time.
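As a quick numerical sanity check on the inequality n < m, the following sketch counts pre-copy rounds for both approaches. All parameter values are assumed for illustration (the reduction ratio is set near the IMR values reported in Chapter 5); none of them are measurements from this thesis.

/* Count pre-copy iterations until the data sent in a round drops to the
 * threshold. All parameter values below are illustrative assumptions. */
#include <stdio.h>

static int rounds(double initial, double lambda, double threshold)
{
    int n = 0;
    double data = initial;     /* data transferred in the current round */
    while (data > threshold) {
        data *= lambda;        /* next round re-sends pages dirtied meanwhile */
        n++;
    }
    return n;
}

int main(void)
{
    double v_tms  = 4096.0;    /* total VM memory (MB)                      */
    double r_page = 360.0;     /* dirty page growth rate (MB/s)             */
    double r_tran = 400.0;     /* network transfer rate (MB/s)              */
    double gamma  = 0.13;      /* reduction ratio gamma_ana (about the IMR) */
    double v_thd  = 8.0;       /* stop-and-copy threshold (MB)              */

    double lambda = r_page / r_tran;        /* < 1, so this is case i) */
    double alpha  = 1.0 - gamma;

    int n = rounds(alpha * v_tms, lambda, v_thd);   /* pre-copy with reduction */
    int m = rounds(v_tms, lambda, v_thd);           /* standard pre-copy       */
    printf("with reduction: n = %d rounds; standard: m = %d rounds\n", n, m);
    return 0;
}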

4.3 EPT Violation Interception

New EPT records and re-mapping records can be obtained through EPT violation interception. When an address translation is required, the 2D page walk hardware traverses the guest page table to map guest virtual addresses to guest physical addresses, with each guest physical address requiring a nested page table walk to obtain the system physical address. Figure 4.2 shows the steps required for a 2D walk. The numbers within each circle or square in Figure 4.2 show the ordering of memory references that take place during an end-to-end 2D page walk. The final SPA indicates a system memory reference to the referenced datum once the translation has been created. The boxes represent the stages of the guest page walk, and the circles represent the stages of the nested page walk. Each circle or square contains a label showing the level of the page walk, gL4 to gL1 for the guest page table walk and nL4 to nL1 for the nested page table walk. Guest physical addresses are indicated by dotted lines. In nested paging, a gLn entry cannot be read directly using a guest physical address; a nested page table walk must translate the guest physical address before the gLn entry can be read. The guest physical address for gL4 serves as an input to a recursive call to the page table walker, this time with nCR3 as the base of the page table. The page table walker reads the four nLn entries for gL4 to translate the guest physical address into a system physical address that can be used to read the desired gL4 entry.

The walk proceeds to the next level of the guest page table, which corresponds to the second row in Figure 4.2, and again reads four nLn entries for gL3 to find the required system physical address. This portion of the walk repeats for gL2 and gL1. The gL1 entry at step 20 determines the guest physical address of the base of the guest data page. At this point, the guest page table has been traversed, but one final nested page walk (steps 21-24) is required to translate the guest physical address of the datum to a usable system physical address.

Figure 4.2. 2-D Walk.

Every time there is an EPT violation, the Linux kernel captures an EPT violation exception. Both a new allocation and an EPT remapping call the mmu_set_spte function; the difference between the two operations can be recognized by their sptep variable.

If sptep equals zero, it is a new allocation operation; otherwise, it is an EPT remapping operation.

Figure 4.3. The Process of EPT Violation Handling.

The whole process of EPT violation handling is illustrated in Figure 4.3. If the system cannot find a mapping record in the EPT, it generates an EPT violation exception, which is captured and handled by kvm_mmu_page_fault. The left subtree of Figure 4.3 shows how KVM creates and modifies the EPT table structure.

Figure 4.4. For Each Shadow Entry.

For each request, the MMU checks the mapping relationship at each level of the EPT table. The details of this technique have already been described with Figure 4.2. Figure 4.4 shows the structure of the shadow entry walk, which includes two main functions, shadow_walk_okay and shadow_walk_next.

The function shadow_walk_okay checks the record in the table at the current level. If the level of the current table is less than 1 or is the last level, it returns false; otherwise, it returns the base address of the next level or the final address. The function shadow_walk_next indexes the next level of the EPT table. Both functions are shown in Figures 4.5 and 4.6.

Figure 4.5. Shadow Walk Okay.

If KVM cannot find the record through the 2-D walk, or finds a record that needs to be changed, the function mmu_set_spte is called to create or modify EPT record entries. Thus, by intercepting the page fault function calls made through mmu_set_spte, we can capture all EPT operations, such as allocating a new record or modifying an existing record.

Figure 4.6. Shadow Walk Next.

Here are two examples of the two different EPT operations:

(1) Jan 23 10:21:26 spte: 0 write_fault: 2 gfn: ec0
(2) Jan 23 10:21:26 spte: 1eb42fc77 write_fault: 2 gfn: ebe

The first example is a new allocation operation, because the SPTE (shadow pte) value is zero, which means there is no existing physical page mapped to guest frame number (gfn) ec0.

The second example is a remapping operation, because it already shows its old mapping, between gfn ebe and spte 1eb42fc77, which is an existing shadow pte. Because KVM runs in Linux kernel mode, we redirect this output to /var/log/kern.log for memory changing rate analysis by a program running in user mode.
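The interception itself can be as simple as a printk added where KVM installs or updates a shadow/EPT entry. The helper below is only a sketch of that logging: the real mmu_set_spte signature and the exact hook point differ between kernel versions, so the function shown here is an illustrative stand-in rather than the thesis's actual patch.

/* Sketch of the kernel-side logging (illustrative, not the actual patch).
 * Intended to be called from KVM's mmu_set_spte() path. */
#include <linux/kernel.h>
#include <linux/kvm_host.h>

static void log_ept_operation(u64 old_spte, int write_fault, gfn_t gfn)
{
    /* old_spte == 0: a brand-new mapping is being created for this gfn;
     * non-zero:      an existing mapping (shadow pte) is being changed. */
    printk(KERN_INFO "spte: %llx write_fault: %d gfn: %llx\n",
           (unsigned long long)old_spte, write_fault,
           (unsigned long long)gfn);
}

The messages reach /var/log/kern.log through the normal kernel log path, where the user-mode analyzer described in Section 4.5 picks them up.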

4.4 Dirty Page Detection

A dirty page is any modified page residing in the buffer cache that has not yet been flushed to memory and disk; in other words, uncommitted data residing in the buffer cache. If we want to detect all dirty pages, we must first study the basic memory structure involved, kvm_memory_slot. KVM uses this structure for each guest memory region to record its address mapping relationship and all its properties, and each memory slot maintains its own data structure. Figure 4.9 shows the definition of kvm_memory_slot. There is a bitmap pointer called *dirty_bitmap in the kvm_memory_slot data structure. By iterating over this bitmap we can get the number of dirty guest frames, so the only question is how and when we can iterate over it. The Linux kernel regularly runs processes that check all dirty pages and verify whether the number of dirty pages is too large; if it is, the kernel commits them. Through our research, we found that KVM periodically invokes a function called kvm_vm_ioctl_get_dirty_log, which copies all dirty page records to user space and gets and clears the log of dirty pages in a slot. That means KVM can clean the dirty pages of each memory slot through this function whenever the Linux kernel wants to do so. This function is shown in Figures 4.10 and 4.11. Our approach is to redirect the gfn of each dirty page to /var/log/kern.log, in the same way as the EPT violation interception in the previous section, whenever kvm_vm_ioctl_get_dirty_log is invoked, because only while it is running can dirty pages be cleaned by the kernel. Between two calls of this function, the dirty pages do not change, so this approach does not give us stale records. Here is one example of our customized output in the system log file:

Jan 23 10:21:25 dirtygfn: gfn:
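For completeness, the dirty bitmap that kvm_vm_ioctl_get_dirty_log maintains can also be read from user space with the KVM_GET_DIRTY_LOG ioctl (the same mechanism QEMU uses during migration). The sketch below is that user-space alternative, not the kernel-side modification described above; it assumes the slot was registered with the KVM_MEM_LOG_DIRTY_PAGES flag and starts at guest physical address 0.

/* User-space sketch: fetch and print the dirty gfns of memory slot 0.
 * 'vm' must be a KVM VM fd whose slot 0 covers mem_size bytes and was
 * created with the KVM_MEM_LOG_DIRTY_PAGES flag. */
#include <linux/kvm.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>

#define PAGE_SIZE 4096UL

static void dump_dirty_gfns(int vm, size_t mem_size)
{
    size_t npages = mem_size / PAGE_SIZE;
    uint8_t *bitmap = calloc(1, (npages + 7) / 8);

    struct kvm_dirty_log log;
    memset(&log, 0, sizeof(log));
    log.slot = 0;
    log.dirty_bitmap = bitmap;

    /* Returns (and clears) the dirty bitmap accumulated since the last call. */
    ioctl(vm, KVM_GET_DIRTY_LOG, &log);

    for (size_t gfn = 0; gfn < npages; gfn++)
        if (bitmap[gfn / 8] & (1u << (gfn % 8)))
            printf("dirtygfn: gfn: %zx\n", gfn);

    free(bitmap);
}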

4.5 RSM Calculation

As mentioned in Section 4.1, we calculate the Relatively Stable Memory (RSM) with equation (1). Through the approaches explained in Sections 4.3 and 4.4, we can obtain all six types of memory pages in Table 4.1 and count the number of each type. The method we use is that our program monitors /var/log/kern.log; according to the time stamps and keywords in the log file, the program generates the real-time RSM result. Figure 4.12 shows the core part of the program. In this program, we use several hash sets to store the different types of memory frames: A (1st-phase new), B (1st-phase remap), C (1st-phase dirty), D (2nd-phase new), E (2nd-phase remap) and F (2nd-phase dirty), following the definitions in Table 4.1. In the first phase, there is no ambiguity: if the keyword is sptmap and the value of spte is zero, it is a new record; if the keyword is sptmap but the value of spte is not zero, it is a remap operation; and if the keyword is dirtygfn, it is a dirty page. All these gfns are put into ept_gfn_add_set, ept_gfn_modify_set and dirty_page_set according to their type. In the second phase, if the type is D, we do not need to do anything special. But if the type is E or F, we need to verify whether the gfn already existed in the first phase. If it did, this operation is modifying or dirtying an existing page allocated in the first phase, and too many operations like this hurt the RSM result. Thus, we put these operations into separate sets for the final calculation. Finally, we calculate the RSM result with equation (1), as discussed in Section 4.1.
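A condensed sketch of this bookkeeping is given below. It mirrors the logic described above but is not the program of Figure 4.12: fixed-size bitmaps indexed by gfn stand in for the hash sets, the maximum gfn is an arbitrary assumption, and the MCR/IMR expressions follow the definitions given with Table 5.1 in Chapter 5.

/* Simplified sketch of the RSM/MCR/IMR bookkeeping (not the code of
 * Figure 4.12); bitmaps indexed by gfn replace the hash sets. */
#include <stdbool.h>
#include <stdio.h>

#define MAX_GFN (1UL << 20)        /* assumed upper bound on guest frame numbers */

static bool first_new[MAX_GFN];    /* A: allocated during the initialization phase */
static bool first_bad[MAX_GFN];    /* B or C: remapped/dirtied during the 1st phase */
static bool later_bad[MAX_GFN];    /* E or F hitting a page allocated in the 1st phase */
static long ept_a;                 /* D: new EPT records in the stable running phase */

/* phase: 1 = initialization, 2 = stable running
 * kind : 'n' = new EPT record, 'r' = re-mapping, 'd' = dirty page */
static void record_event(int phase, char kind, unsigned long gfn)
{
    if (gfn >= MAX_GFN)
        return;
    if (phase == 1) {
        if (kind == 'n')
            first_new[gfn] = true;
        else
            first_bad[gfn] = true;
    } else {
        if (kind == 'n')
            ept_a++;
        else if (first_new[gfn])   /* only 1st-phase pages can hurt the RSM */
            later_bad[gfn] = true;
    }
}

static void report(void)
{
    long rsm = 0, ept_b = 0;
    for (unsigned long g = 0; g < MAX_GFN; g++) {
        if (!first_new[g] || first_bad[g])
            continue;              /* excluded from the RSM by equation (1) */
        if (later_bad[g])
            ept_b++;               /* EPT_B: modified after the 1st phase   */
        else
            rsm++;                 /* relatively stable memory              */
    }
    if (rsm + ept_b == 0)
        return;
    printf("RSM = %ld, MCR = %.2f%%, IMR = %.2f%%\n", rsm,
           100.0 * ept_b / (rsm + ept_b),
           100.0 * rsm / (ept_a + ept_b + rsm));
}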

Figure 4.7. MMU Set SPTE (1).

Figure 4.8. MMU Set SPTE (2).

Figure 4.9. KVM Memory Slot.
Figure 4.10. Get Dirty Log (1).

Figure 4.11. Get Dirty Log (2).

Figure 4.12. RSM Calculation Code (1).

Figure 4.13. RSM Calculation Code (2).

Chapter 5

Result

5.1 Experiment Setup

We carry out our experiments on an HP Envy 15t machine with a 4-core, 8-thread 64-bit Intel processor running at 2.4 GHz (3.4 GHz maximum turbo frequency). The processor supports Virtualization Technology (VT-x) and VT-x with Extended Page Tables (EPT). The output of /proc/cpuinfo is shown in Figure 5.1; because all threads have the same configuration, only one is listed here. The host machine runs Ubuntu LTS. We pulled the latest KVM and Linux kernel sources from git://git.kernel.org/pub/scm/virt/kvm/kvm.git and rebuilt the kernel with support for Kernel-based Virtual Machine, KVM for Intel processors, and iptables. The KVM source code is patched with the modifications described in the previous chapter. QEMU-KVM and libvirt are compiled and installed for user-space virtual machine management and migration. An Apache server is installed on the experimental host to simulate a web server service.

Figure 5.1. Experimental Host Configuration.

In the default virtual directory of the Apache server we put 50 files with an average size of 20MB. For pressure testing, we also wrote a program that simulates concurrent client requests; it lets us set the number of concurrent clients, the delay between requests, the number of repetitions, and so forth. When the simulation finishes, it generates a result report, as shown in Figure 5.2.

Figure 5.2. Pressure Test Program Result.
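The pressure-test program itself is not listed in the thesis; the following is only a minimal sketch of such a client, with a hypothetical command line (host, path, number of clients, repetitions, delay per request), to illustrate how concurrent requests against the Apache virtual directory could be generated.

/*
 * Sketch of a concurrent HTTP load generator (not the thesis's tool).
 * Each worker thread repeatedly fetches one file from the Apache server,
 * draining the response to keep memory pressure on the guest.
 * Build with: cc -o loadgen loadgen.c -lpthread
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
#include <netdb.h>
#include <sys/socket.h>

struct cfg {
    const char *host;   /* e.g. the guest VM's address      */
    const char *path;   /* e.g. a file in the virtual dir   */
    int repetitions;    /* requests issued per client       */
    int delay_ms;       /* pause between two requests       */
};

static void *worker(void *arg)
{
    const struct cfg *c = arg;
    char request[512], buf[8192];

    snprintf(request, sizeof(request),
             "GET %s HTTP/1.0\r\nHost: %s\r\nConnection: close\r\n\r\n",
             c->path, c->host);

    for (int i = 0; i < c->repetitions; i++) {
        struct addrinfo hints = { .ai_socktype = SOCK_STREAM }, *res = NULL;

        if (getaddrinfo(c->host, "80", &hints, &res) != 0)
            continue;
        int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd >= 0 && connect(fd, res->ai_addr, res->ai_addrlen) == 0) {
            (void)send(fd, request, strlen(request), 0);
            while (recv(fd, buf, sizeof(buf), 0) > 0)
                ;                         /* discard the response body */
        }
        if (fd >= 0)
            close(fd);
        freeaddrinfo(res);
        usleep((useconds_t)c->delay_ms * 1000);
    }
    return NULL;
}

int main(int argc, char **argv)
{
    if (argc != 6) {
        fprintf(stderr, "usage: %s host path clients repetitions delay-ms\n", argv[0]);
        return 1;
    }
    struct cfg c = { argv[1], argv[2], atoi(argv[4]), atoi(argv[5]) };
    int clients = atoi(argv[3]);
    pthread_t *tid = calloc(clients, sizeof(*tid));

    for (int i = 0; i < clients; i++)
        pthread_create(&tid[i], NULL, worker, &c);
    for (int i = 0; i < clients; i++)
        pthread_join(tid[i], NULL);
    free(tid);
    return 0;
}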

5.2 Boundary between Two Running Status

In previous chapters we discussed dividing the virtual machine's running status into two phases: an initialization phase and a stable running phase. We conducted experiments that measure the number of new EPT records per minute while varying the boundary between the two phases from 1 minute to 10 minutes. Figure 5.3 shows that the system is very busy allocating memory blocks during the first 3 minutes, allocating approximately 2,500 memory frames in total. From the 3rd minute to the 8th minute the curve is relatively stable and the number of EPT records generated per minute is nearly zero, because the operating system has already finished its startup and initialization process.
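The per-minute counts behind Figure 5.3 can be derived by filtering the kernel-log records described in Chapter 4. The sketch below is not the thesis's script; it assumes a syslog-style line format such as "Jan 23 10:21:25 sptmap: gfn:ec0 spte:0", in which the first twelve characters give the month, day, hour and minute.

/*
 * Sketch: count new EPT mappings ("sptmap:" records with spte == 0) per
 * minute from /var/log/kern.log, to help locate the boundary between the
 * initialization phase and the stable running phase.
 */
#include <stdio.h>
#include <string.h>

#define MAX_MINUTES 1024

int main(void)
{
    static char key[MAX_MINUTES][16];   /* "Mon dd hh:mm" per bucket */
    static long count[MAX_MINUTES];
    int buckets = 0;
    char line[512];
    FILE *f = fopen("/var/log/kern.log", "r");

    if (!f) { perror("kern.log"); return 1; }

    while (fgets(line, sizeof(line), f)) {
        const char *p = strstr(line, "sptmap: gfn:");
        unsigned long long gfn, spte;

        if (!p || sscanf(p, "sptmap: gfn:%llx spte:%llx", &gfn, &spte) != 2)
            continue;
        if (spte != 0)
            continue;                    /* keep only brand-new mappings */

        char minute[16];
        snprintf(minute, sizeof(minute), "%.12s", line);   /* minute key */

        int i;
        for (i = 0; i < buckets && strcmp(key[i], minute) != 0; i++)
            ;
        if (i == buckets && buckets < MAX_MINUTES)
            strcpy(key[buckets++], minute);
        if (i < buckets)
            count[i]++;
    }
    fclose(f);

    for (int i = 0; i < buckets; i++)
        printf("%s  %ld new EPT records\n", key[i], count[i]);
    return 0;
}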

However, after 8 minutes there is a large increase, possibly caused by scheduled tasks, which are outside the scope of this research. Based on this experimental result, we define the first 5 minutes as the initialization phase and the period after the 5th minute as the stable running phase.

Figure 5.3. Number of New EPT Records by Increasing Time.

5.3 RSM Result

This section presents the RSM evaluation results. We tested the system's memory changing rate under five different pressure levels. As mentioned in previous sections, there are 50 files under the Apache virtual directory for clients to access.

In the first test we simulate 300 concurrent users accessing 200MB of files in the virtual directory. In the following tests we change the total file size to 400MB, 600MB, 800MB and 1000MB. Every user repeats its access operation 1,000 times. Because the total file size varies, the memory pressure changes accordingly. Each test lasts 60 minutes.

Table 5.1. Experimental Results under 5 Different Pressure Tests.

                          200M     400M     600M     800M     1000M
RSM
EPT_A
EPT_B
Memory Changing Rate      0.66%    0.95%    1.12%    0.54%    0.99%
Inactive Memory Rate      13.21%   6.41%    7.66%    8.63%    3.61%

Table 5.1 shows the experimental results under the 5 different pressure tests. EPT_A is the number of EPT records created in the 2nd phase. EPT_B is the number of EPT records modified after they were created in the 1st phase. The Memory Changing Rate (MCR) is EPT_B / (RSM + EPT_B); it shows how much the memory frames allocated in the 1st phase change. The Inactive Memory Rate (IMR) is RSM / (EPT_A + EPT_B + RSM); it shows the percentage of inactive memory in the whole memory, and we want this value to be as large as possible. From these definitions, EPT_A + EPT_B + RSM is the total number of EPT records, and EPT_B + RSM is the total number of EPT records created in the 1st phase. Several observations follow from the results. i) Figure 5.4 shows that the growth of EPT_A is almost linear: as the memory pressure increases, the number of EPT_A records also increases. This is reasonable, because the system must allocate more memory frames to fulfill the file access requests.

ii) From the discussion in previous chapters we already know, from a theoretical perspective, that the total number of memory frames allocated in the 1st phase should be roughly constant, and the results in Figure 5.4 confirm this. Moreover, we found that the numbers of RSM and EPT_B records are stable as well: RSM ranges roughly from 10,000 to 30,000, and EPT_B from 100 to 200. One possible explanation is that memory frames allocated during the 1st phase are used for system purposes, so operations unrelated to the system do not affect this part of memory; EPT_B only changes when an operation modifies memory slots used by system services, and most Apache server operations do not. iii) Figure 5.5 shows that IMR decreases as memory pressure increases, mainly because of the growing number of EPT_A records. The MCR, however, remains flat and small, ranging from 0.54% to 1.12%. This reveals that only a small portion of the memory frames allocated in the 1st phase will be modified later, no matter how much pressure the system is under.

5.4 Conclusion and Future Work

In this work, I have demonstrated that the RSM can be used to improve the performance of Apache web server live migration. Our proposed approach measures RSM and MCR before live migration in order to divide memory frames into an active part and a relatively stable part. This research focuses on the Apache static web server, which has a roughly constant number of write operations.

Figure 5.4. EPT Record Changing Result.

Our experiments discussed in the previous section clearly demonstrate the following. i) The Apache web server's memory pressure is related to IMR: the lower the pressure, the better the IMR. In our experiments the average IMR across the 5 tests is 7.89%, which means 7.89% of the allocated memory can be merged into the Virtual Machine image before live migration. Because of the high-speed connection and bandwidth of a SAN, the latency of putting and getting data through the SAN is very small; if we ignore these latencies, the performance of Apache web server live migration can be improved. ii) Moreover, although IMR decreases as memory pressure increases, the MCR and the total number of EPT records in the 1st phase remain quite stable.

Figure 5.5. MCR and IMR.

Thus, we can conclude from our experimental results that, for one Apache web server, there will always be 10,000 to 30,000 RSM memory frames. Since the size of a memory frame is 4KB, the proposed method can save up to almost 120MB of memory during live migration (30,000 frames × 4KB is roughly 117MB). These RSM frames can potentially help reduce the live migration duration. iii) Although several existing approaches based on data compression can reduce the transferred data by nearly 60%, our proposed method is orthogonal to them: our approach performs memory analysis before live migration, while theirs compress the transferred data during live migration. Thus, our approach can easily be combined with existing methods.

Figure 5.6. Memory Changing Rate in 60 Minutes under 200M Pressure Test.

Lastly, possible future work on this topic includes integrating this technique into the live migration process itself, as outlined in Figure 1.1. We have already identified this potential improvement through memory analysis; the next step is to implement the combined migration path and measure the resulting reduction in total migration time.
