Adding Advanced Storage Controller Functionality via Low-Overhead Virtualization

Muli Ben-Yehuda, Michael Factor, Eran Rom, and Avishay Traeger
IBM Research Haifa
Eran Borovik and Ben-Ami Yassour

Abstract

Historically, storage controllers have been extended by integrating new code, e.g., file serving, database processing, deduplication, etc., into an existing base. This integration leads to complexity, co-dependency and instability of both the original and new functions. Hypervisors are a known mechanism for isolating different functions. However, to extend a storage controller by providing new functions in a virtual machine (VM), the virtualization overhead must be negligible, which is not the case in a straightforward implementation. This paper demonstrates a set of mechanisms and techniques that achieve near zero runtime performance overhead for using virtualization in the context of a storage system.

1 Introduction

Additional functions, such as file serving or database processing, are often added to existing storage systems to meet new requirements. Historically, this has been done via code integration, or by running the new function on a gateway or virtual storage appliance (VSA [37]). Code integration generally performs best. However, the new function must run on the same OS version, the controller's main functionality is vulnerable to bugs due to lack of isolation, resource management is complicated for software which assumes a dedicated system, and development complexity increases, in particular when the new function already exists as independent software. The gateway approach offers isolation but adds both latency and hardware costs. A hypervisor can isolate the new function, allow for differing OS versions, and simplify development. However, until now the high performance overhead of virtualization (in particular virtualized I/O) has made this approach impractical.

In this paper, we show how to use server-based virtualization to integrate new functions into a storage system with near zero performance cost. Our approach is in line with the VSA approach, but we run the VM directly on the storage system. While our work was done using KVM [14], our insights are not KVM-specific. We do take advantage of the fact that KVM uses an asymmetric model in which some of the code is virtualized (the new features) while other code (the original storage system) runs on bare metal, unaware of the existence of the hypervisor.

There are three sources of performance overhead. Base overheads include aspects such as virtual memory management or process switching. External communication with storage clients is important when the new function is a filter on top of the original storage system, e.g., a file server. Finally, internal communication overheads are incurred to tie the new function to the original controller.

To reduce base overhead, we use two main techniques. First, we statically allocate CPU cores to the guest to ensure that the function has sufficient resources. Second, we statically allocate memory for the VM, backing that area with larger pages to reduce translation overheads.

The straightforward implementation of external communication is expensive because the hypervisor intervenes when physical events occur (e.g., interrupts or device accesses). Each such intervention entails an expensive exit from the guest code to the hypervisor. The highest-performing approach for reducing this overhead is device assignment, which eliminates exits for device access.
Thus, to reduce these costs, we assign the network device directly to the guest using an SR-IOV-enabled adapter [23], which allows the guest to send requests directly to the device. To eliminate exits for interrupts, we use polling instead of interrupts, a well-known technique in storage systems. To reduce the cost of internal communication, we modified KVM's para-virtual block driver to poll as well, eliminating exits due to PIOs and interrupt injections. This provides a fast, exit-less, zero-copy transport.

By using these techniques, we show no measurable difference in network latency between bare metal and virtualized I/O and under 5% difference in throughput. For internal communication, micro-benchmarks show 6.6µs latency overhead, read throughput of 357K IOPS, and write throughput of 284K IOPS; roughly seven times better than a base KVM implementation. In addition, an I/O-intensive filer workload running in KVM incurs less than 0.4% runtime performance overhead compared to bare metal integration.

Our main contributions are: a detailed, benchmark-driven analysis of virtualization overheads in a storage system context, a set of approaches to removing overheads, and a demonstration of how these approaches enable running new storage features in a VM with essentially zero runtime performance overhead.
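As a small illustration of the second base-overhead technique (a statically allocated memory area backed by larger pages), the sketch below reserves a hugepage-backed, locked region using standard Linux calls. It is only an illustration of the mechanism under stated assumptions: real KVM/QEMU deployments typically configure hugepage backing through the hypervisor's own options rather than application code, and the 256MB size is arbitrary.

    /* Illustration only: reserve a memory region backed by huge pages and
     * lock it, so the area is statically allocated and needs fewer TLB
     * entries. Requires huge pages to have been reserved beforehand
     * (e.g., via /proc/sys/vm/nr_hugepages). */
    #define _GNU_SOURCE
    #include <sys/mman.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t len = 256UL << 20;   /* 256MB, a stand-in for the VM's memory */

        void *area = mmap(NULL, len, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (area == MAP_FAILED) {
            perror("mmap(MAP_HUGETLB)");   /* typically: no huge pages reserved */
            return 1;
        }

        /* Pin the region so it is never swapped; together with the huge-page
         * backing this gives a static, low-translation-overhead area. */
        if (mlock(area, len) != 0) {
            perror("mlock");
            return 1;
        }

        printf("reserved %zu bytes of hugepage-backed, locked memory\n", len);
        munmap(area, len);
        return 0;
    }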

The rest of the paper is organized as follows. Section 2 provides background on KVM and VM I/O. We take an incremental approach to showing our performance improvements; Section 3 describes the experimental environment. Sections 4 and 5 present a performance analysis and describe optimizations related to the external and internal communication interfaces, respectively. Base overheads are shown together with macro-benchmark results in Section 6. Section 7 describes related work and we conclude in Section 8.

2 x86 I/O Virtualization Primer

We now provide some background information on KVM (the hypervisor used in this paper) and virtual machine I/O. There are two main options for where a hypervisor resides. Type 1 hypervisors run directly on the hardware, whereas type 2 hypervisors are hosted by an OS. KVM takes a hybrid approach that combines the benefits of both. It is a Linux kernel module that leverages Intel VT-x or AMD-V CPU features for running unmodified virtual machines, thereby creating a single host kernel/hypervisor that runs both processes and virtual machines. Such a hybrid architecture allows the storage controller software to run unmodified on bare metal while also running additional functionality in virtual machines.

There are three main methods for accessing I/O devices in VMs. In the first, emulation, the hypervisor emulates a specific device in software [35]. The OS running in the VM (guest OS) uses its regular device drivers to access the emulated device. This method requires no changes to the guest, but suffers from poor performance. In the second method, para-virtualization [4], the guest OS runs specialized code to cooperate with the hypervisor to reduce overheads. For example, KVM's para-virtualized drivers use virtio [26], which presents a ring buffer transport (vring) and device configuration as a PCI device. Drivers such as network, block, and video are implemented using virtio. In general, the guest OS driver places pointers to buffers on the vring and initiates I/O via a Programmed I/O (PIO) command. The hypervisor directly accesses the buffers from the guest OS's memory (zero-copy). Para-virtualized devices perform better than emulated devices, but require installing hypervisor-specific drivers in the guest OS. The third method, device assignment [6, 17, 39], gives the VM a physical device that it can submit I/Os to without the hypervisor's involvement. An I/O Memory Management Unit (IOMMU) provides address translation and memory protection [6, 7, 38]. Interrupts, however, are routed to the guest OS via the hypervisor. Assigning a device to the VM means that no other OS can access it (including the hypervisor or other guests). However, technologies such as Single Root I/O Virtualization (SR-IOV) [23] allow devices to be assigned to multiple OSs.
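To make the para-virtualized flow concrete, the toy sketch below mimics the guest-side pattern just described: buffer descriptors are placed on a shared ring and the host is then notified. It is not the Linux virtio code; the layout and names (vring_desc_slot, kick_host) are invented, there is no real device behind it, and the PIO notification that would trap to the hypervisor is replaced by a print statement. Sections 4 and 5 are largely about eliminating that kind of exit.

    /* Toy illustration of the para-virtualized flow: descriptors pointing at
     * guest buffers are placed on a shared ring, and the host is notified.
     * The names and layout are invented for the sketch. */
    #include <stdint.h>
    #include <stdio.h>

    struct vring_desc_slot {          /* one descriptor in the shared ring */
        uint64_t guest_phys_addr;     /* where the data buffer lives       */
        uint32_t len;
        uint16_t flags;               /* e.g., marks a device-writable buffer */
    };

    #define RING_ENTRIES 16
    static struct vring_desc_slot ring[RING_ENTRIES];
    static uint16_t avail_idx;        /* next free descriptor slot */

    static void kick_host(void)
    {
        /* A real guest driver issues a programmed I/O (PIO) write to the
         * virtio PCI device here, which traps to the hypervisor; this exit
         * is exactly the cost that the polling transport of Section 5
         * removes. */
        printf("PIO kick: host, please look at the ring\n");
    }

    /* Queue one buffer for the device and notify the host. */
    static void submit_buffer(uint64_t gpa, uint32_t len, int device_writes)
    {
        struct vring_desc_slot *d = &ring[avail_idx % RING_ENTRIES];
        d->guest_phys_addr = gpa;
        d->len   = len;
        d->flags = device_writes ? 1 : 0;
        avail_idx++;
        kick_host();
    }

    int main(void)
    {
        submit_buffer(0x100000, 4096, 1);   /* e.g., a 4KB read into guest memory */
        return 0;
    }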
3 Experimental Setup

We take an incremental approach to showing how to eliminate the virtualization overheads. For our experiments we used two servers, each with two quad-core 2.93GHz EPT-enabled Intel Xeon 5500 processors, 16GB of RAM and an Emulex OneConnect 10Gb Ethernet adapter. The servers were connected with a 10Gb cable. One server acted as a load generator and the other was our (emulated) storage controller platform. We used RHEL 5.4 with the Red Hat el5 kernel for both the load server and the guest. The controller server used the Red Hat kernel for bare metal runs and Ubuntu 9.10 with a vanilla kernel for KVM runs. The newer kernel was necessary for running KVM. The controller server was run with four cores enabled, unless otherwise specified. For VM-based experiments, two cores and 2GB of memory were assigned to the guest; all four cores were used by the host in the bare metal cases. We used an 8GB ramdisk for the storage back-end in the experiments described in Sections 5 and 6. This allowed us to measure I/O performance without physical disks becoming the bottleneck. We accessed the ramdisk via a loopback device, which allowed us to assign disk I/O handling to specific cores, similar to the way a storage controller functions. All results shown are the averages of at least 5 runs, with standard deviations below 5%.

4 Network Communication Performance

Enabling the guest to interact with the outside world requires I/O access. As discussed in Section 2, each of the three common approaches to I/O virtualization has benefits and drawbacks. We identified device assignment, the best performing option, as the most suitable approach for adding new functionality to storage controllers. KVM's initial device assignment implementation, however, did not provide the necessary performance. In the remainder of this section, we analyze device assignment and discuss a set of optimizations which allowed us to achieve near bare-metal performance.

Virtualization overhead is mainly due to events that are trapped by the hypervisor, causing costly exits [1, 5, 16]. The overhead is a factor of the frequency of exits and the time it takes the hypervisor to handle an exit and resume running the guest. To examine the performance impact of virtualization for our intended use and ways to reduce it, we focused on networking micro-benchmarks. Our goal is to minimize the amount of time that the hypervisor needs to run, by minimizing the number of exits.

The first technique that we used to improve the guest's performance is related to the handling of the hlt (halt) and mwait x86 instructions. When the OS does not have any work to do, it can call these instructions to enter a power-saving mode.

Most hypervisors will trap these instructions and run other tasks on the core. In our case, however, the new function should always run. We therefore instructed the guest OS to enter an idle loop when there is no work to be done by enabling a kernel boot parameter (idle=poll). This improves performance, as the guest is always running.

The second technique that we used is related to interrupt handling. Most of the guest exits related to device assignment are caused by interrupts [2, 5, 18]. Every external interrupt causes at least two guest exits: first, when the interrupt arrives (causing the hypervisor to gain control and to inject the interrupt into the guest), and second, when the guest signals completion of the interrupt handling (causing the host to gain control and to emulate the completion for the guest). The guest can configure the adapter to use two different interrupt-delivery modes: MSI, the newer message-based interrupt protocol, or the legacy INTX protocol. The KVM implementation we used incurred additional overhead when using MSI interrupts, due to additional exits for masking and unmasking adapter interrupts.

Since most of the virtualization overhead comes from interrupts, our approach is to run the adapter in polling mode rather than interrupt-driven mode. In Linux today, most network adapters use NAPI [30, 31], a hybrid approach to reducing interrupt overhead which switches between polling and interrupt-driven operation depending on the network traffic. However, even with NAPI, we have seen interrupt rates of 70K interrupts per second. Since such a high interrupt rate can incur prohibitive overhead and interrupts are not necessary for our intended use case, we decided to forgo interrupts and use polling.

Our driver creates a new thread for the polling functions. The adapter we use has three types of events: packet received, packet sent, and command completion. Since there is no way to know when a packet will be received, our driver continuously polls for received packets; packet sent and command completion indications are handled by the same thread every so often. Using a constantly polling thread means that we dedicate most of a core to this functionality. While this might seem expensive from a resource perspective, it proved critical to achieving the desired performance. A single core could also be used to poll multiple devices by integrating their polling into a single thread, or by scheduling different polling threads on the same core. We did not experiment with this configuration.
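The sketch below shows, as a structural illustration only, the shape such a dedicated polling thread can take: receive events are checked on every iteration, while transmit and command completions are checked only every so often. The poll_*/handle_* helpers and the check interval are placeholders standing in for the adapter-specific queue accesses; this is not the actual driver code.

    /* Structural sketch of a dedicated network polling thread; the poll_* and
     * handle_* helpers are placeholders for adapter-specific queue accesses. */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <unistd.h>

    static bool poll_rx(void)    { /* check the RX completion queue */ return false; }
    static bool poll_tx(void)    { /* check the TX completion queue */ return false; }
    static bool poll_cmd(void)   { /* check for command completions */ return false; }
    static void handle_rx(void)  { /* hand the packet to the stack  */ }
    static void handle_tx(void)  { /* free the transmitted buffer   */ }
    static void handle_cmd(void) { /* complete the pending command  */ }

    static atomic_bool stop;

    static void *net_poll_thread(void *arg)
    {
        (void)arg;
        unsigned iterations = 0;

        while (!stop) {
            /* Packets may arrive at any time, so poll for them continuously. */
            while (poll_rx())
                handle_rx();

            /* Sent-packet and command completions are less urgent, so they
             * are checked only every so often to keep the common path short. */
            if (++iterations % 64 == 0) {
                while (poll_tx())
                    handle_tx();
                while (poll_cmd())
                    handle_cmd();
            }
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t poller;

        /* In the real system such a thread is pinned to its own core (Section 6). */
        pthread_create(&poller, NULL, net_poll_thread, NULL);
        sleep(1);
        stop = true;
        pthread_join(poller, NULL);
        return 0;
    }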
Next we evaluate the performance of the polling driver using network micro-benchmarks. Table 1 depicts the average duration of a ping flood command going from a client machine to the system under test. The system under test replies to pings using our driver either in polling mode or in INTX mode. The driver runs either in the host (bare metal), in the guest with halt disabled, or in the guest with halt enabled (guest halt).

[Table 1: Ping average latency (µs)]

Figure 1 shows the results for several netperf request-response configurations, measuring round-trip time using 1-byte packets. guest msi and guest intx stand for the guest using MSI and INTX interrupt delivery, respectively; host msi stands for the host using interrupts in MSI mode; guest poll and host poll stand for the guest and host using polling mode, respectively. As expected, polling mode achieves better performance than interrupt mode in the host (i.e., on bare metal). Since in guest mode the cost of interrupts is much higher, the gain from using polling is more significant than in the bare-metal case. Using MSI interrupts in guest mode has a significant impact on performance with this KVM version, since there are frequent exits due to interrupt masking calls by the guest.

[Figure 1: netperf request-response throughput]

Figure 2 shows the results of a single-threaded netperf send TCP throughput test (system under test is sending) in the same configurations as the previous figure: host using polling and INTX interrupts, guest using polling, INTX, and MSI interrupts.

[Figure 2: netperf TCP send throughput]

Here the contribution of polling is less noticeable, since the TCP stack batches network processing. On bare metal, polling provides better performance than interrupt mode. In guest mode, the advantage of polling is much more significant.

[Figure 3: netperf TCP receive throughput]

Figure 3 shows the results of a single-threaded netperf receive TCP throughput test (system under test is receiving). Here it is surprising to see that for the bare-metal case the performance of polling (host poll) is lower than that of interrupts (host msi). The reason is that in the netperf throughput test the sender is the bottleneck. When the receiver is working in polling mode, it sends many more acknowledgment packets to the sender; for example, in the case of 1K messages, the receiver sends approximately 10 times as many ACKs. Since the sender is already the bottleneck, sending more ACKs generates more load on the sender, which reduces sender throughput. While this issue is noticeable for this micro-benchmark, in practice the handling time of a packet by the receiver is much larger, hence in most cases the sender is not the bottleneck. Polling achieves the same performance in the guest poll and host poll cases, which indicates that the virtualization-induced runtime overhead is negligible. To verify that the reduced performance for the receive test is an artifact of TCP, we ran the same test using UDP. With UDP, all setups (guest or bare metal, interrupts or polling) achieve the same performance. Because the sender is the bottleneck, once the TCP ACK effect is removed, performance is not affected by the receiver's mode of operation.

5 Internal Communication Performance

Of the three methods for accessing I/O devices described in Section 2, we use para-virtualization for internal communication. Para-virtualization performs better than emulated devices, and because we supply the VM image that runs in the controller, we can easily use custom drivers. Further, our goal is to transmit I/O requests to a controller process running on the host, so device assignment is less practical. For example, we cannot assign the drives because the storage controller, not the guest OS, must own them. One may also consider using external communication to access the controller via iSCSI or Fibre Channel, but this adds unnecessary communication overheads.

[Figure 4: Unmodified KVM para-virtualized block I/O path]
[Figure 5: Para-virtualized block I/O path with polling]
We use a ramdisk as the backing store for our analysis to prevent the disks from dominating latencies or becoming a bottleneck. In addition, we use direct I/O to prevent caching effects that would mask virtualization overheads. Latencies presented are the average over 10 minutes. Section 5.1 describes the vanilla KVM para-virtualized block I/O path, and Section 5.2 describes our optimizations.

5.1 KVM Para-virtualized Block I/O

Figure 4 depicts the unmodified para-virtualized block I/O path in KVM, along with associated latencies for major code blocks when executing a 4KB direct I/O write request. The guest application initiates an I/O, which is handled by the guest kernel as usual. The direct I/O waiting time (DIO Wait, 16.6% of the total) consists of world switches and context switches between threads inside the guest.

Though we have drawn it as one block, it is interleaved with other code running on the same core. The para-virtualized block driver (virtio-block front-end) iterates over the requests in the elevator queue and places each request's I/O descriptors on the vring, a queue residing in guest memory that is accessible by the host. The driver then issues a programmed I/O (PIO) command, which causes a world switch from guest to host. Control is transferred to the KVM kernel module to handle the exit. The post- and pre-execution times (24% and 18.4%, respectively) account for the work done by both KVM and QEMU to change contexts between the guest and the QEMU process (including the exit/entry). KVM identifies the cause of the exit and, in this case, passes control to the QEMU virtio-block back-end (BE). It extracts the I/O descriptors from the vring without copies and passes the requests to the block driver layer (QEMU BDRV), which initiates asynchronous I/Os to the block device. The guest may now resume execution. An event-driven dedicated QEMU thread receives I/O completions and forwards them to the virtio-block BE. The BE updates the vring with completion information and calls upon KVM to inject an interrupt into the guest, for which KVM must initiate a world switch. When the guest resumes, its kernel handles the interrupt as normal, and then accesses its APIC to signal the end of interrupt, causing yet another exit. Locks to synchronize the two QEMU threads incur additional overhead.

5.2 Para-virtualized Block Optimizations

To reduce virtualization overhead, we added a polling thread to QEMU, as depicted in Figure 5. The thread polls the vring (1) for I/O requests coming from the guest and (2) for I/O completions coming from the host kernel. The polling thread invokes the virtio-block BE code on incoming I/Os and completions. This thread does not necessarily need to reside in QEMU; if the storage controller is polling-based, its polling thread may be used.

As discussed in Section 4, we added a polling thread to the guest which polls the networking device. We utilize this same thread to poll the vring for I/O completions. When it detects an I/O completion event, it invokes the guest I/O completion stack, which would normally be called by the interrupt handler. By using polling on both sides of the vring, we avoid all I/O-related exits, and thus also avoid all of the pre- and post-guest execution code. We also avoid locking the queue, since now only the polling thread accesses it. For the 4KB direct I/O write, this improves the latency from 50µs to 15.9µs. Comparing Figures 4 and 5, we see that polling better utilizes the CPU for I/O-related work. Additionally, components that we didn't directly optimize (such as the VFS layer) are more efficient thanks to better cache utilization and less cache pollution due to fewer context switches.

We performed two additional code optimizations in QEMU to reduce latencies, whose impact is already included in the above discussion. When accessing a guest's memory, QEMU must first translate the address using a page-table-like data structure. This handles cases where the guest's memory can be remapped (for example, when dealing with PCI devices). In our case, the memory layout is static, rendering the translation unnecessary. Removing these unnecessary lookups improved performance by 4.6% for 4KB reads and 4.2% for 4KB writes. The second optimization is to use a memory pool for internal QEMU request structures. This saved 3% for 4KB reads and 2.5% for 4KB writes.
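The exit-less transport can be illustrated with a small self-contained model: two single-producer/single-consumer rings shared between a "guest" thread and a "host" polling thread, with both sides busy-polling so that no notification (PIO exit or interrupt injection) is ever needed and no locks are required. This is only a toy analogue of the mechanism, not the virtio/QEMU implementation; ring sizes and request counts are arbitrary.

    /* Toy model of polling on both sides of a shared ring: one ring carries
     * requests from "guest" to "host", the other carries completions back.
     * Each ring has a single producer and a single consumer, so no locks. */
    #include <stdatomic.h>
    #include <pthread.h>
    #include <stdio.h>

    #define RING_SIZE 256            /* power of two, like a vring */

    struct ring {
        _Atomic unsigned head;       /* written only by the producer */
        _Atomic unsigned tail;       /* written only by the consumer */
        int slots[RING_SIZE];        /* stands in for I/O descriptors */
    };

    static int ring_put(struct ring *r, int v)
    {
        unsigned h = atomic_load_explicit(&r->head, memory_order_relaxed);
        if (h - atomic_load_explicit(&r->tail, memory_order_acquire) == RING_SIZE)
            return -1;               /* full */
        r->slots[h % RING_SIZE] = v;
        atomic_store_explicit(&r->head, h + 1, memory_order_release);
        return 0;
    }

    static int ring_get(struct ring *r, int *v)
    {
        unsigned t = atomic_load_explicit(&r->tail, memory_order_relaxed);
        if (t == atomic_load_explicit(&r->head, memory_order_acquire))
            return -1;               /* empty */
        *v = r->slots[t % RING_SIZE];
        atomic_store_explicit(&r->tail, t + 1, memory_order_release);
        return 0;
    }

    static struct ring requests, completions;

    /* "Guest" side: submit requests and poll for completions; no interrupts. */
    static void *guest_thread(void *arg)
    {
        (void)arg;
        int submitted = 0, completed = 0, c;
        while (completed < 1000) {
            while (submitted < 1000 && ring_put(&requests, submitted) == 0)
                submitted++;
            while (ring_get(&completions, &c) == 0)
                completed++;         /* completion noticed by polling alone */
        }
        return NULL;
    }

    /* "Host" side: the dedicated polling thread of Section 5.2. */
    static void *host_thread(void *arg)
    {
        (void)arg;
        int handled = 0, req;
        while (handled < 1000) {
            if (ring_get(&requests, &req) == 0) {
                /* A real backend would issue asynchronous I/O here; we just
                 * echo a completion back onto the second ring. */
                while (ring_put(&completions, req) != 0)
                    ;                /* completions ring full: keep polling */
                handled++;
            }
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t g, h;
        pthread_create(&h, NULL, host_thread, NULL);
        pthread_create(&g, NULL, guest_thread, NULL);
        pthread_join(g, NULL);
        pthread_join(h, NULL);
        puts("all requests completed via polling only");
        return 0;
    }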
5.3 Overall Performance Calculation

A storage controller running a new function in a VM that uses interrupts for its internal communication would pay a rather significant performance penalty. Looking at Figure 4, the corresponding storage controller implementation would look similar, except that the AIO calls would be replaced by asynchronous calls to the controller code. We consider any work done from the time the application submits the I/O until it reaches the controller to be virtualization overhead (work that would not be done if running directly on the host). In the unmodified case, the overhead is 49µs (we subtract only the latency of the application layer from the total).

If our techniques were integrated into a controller, we would calculate the latency overhead as follows, based on Figure 5. We begin with the total, 15.9µs, and subtract the application layer, as we did in the previous case. Further, we subtract the QEMU BDRV layer, and the AIO system call and completion, because these would be replaced by the controller code and are therefore not considered virtualization overhead. The final overhead is therefore 7.7µs before the two QEMU optimizations, and 6.6µs after.

To put these overheads in context, we estimate our performance impact on the fastest latencies published using the SPC-1 benchmark since 2009 [34]. The fastest result was 130µs, and our virtualization technique would add approximately 5% overhead to this case (the baseline case would add approximately 38%). The average of the 27 controllers' fastest latencies is 482µs, and in this average case, our virtualization techniques would add only 1.4% (the baseline would add over 10%).

Our improvements affect throughput in addition to latency. To measure these effects, we ran micro-benchmarks consisting of multi-threaded 4KB direct I/Os; we improved read IOPS by a factor of 7.3x (from 48.8K to 357.5K) and write IOPS by 6.5x (from 43.8K to 284.1K).
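For concreteness, the latency calculation above works out as follows using the per-component values from the Figure 5 breakdown (4KB direct I/O write, 15.9µs total):

    15.9 µs (total)
    - 0.4 µs (application layer)
    - 1.1 µs (QEMU BDRV layer)
    - 5.7 µs (AIO system call)
    - 1.0 µs (AIO completion)
    = 7.7 µs of virtualization overhead

The two QEMU optimizations account for the remaining 1.1µs, giving the 6.6µs figure. As a sanity check on the SPC-1 comparison, 6.6µs is roughly 5% of the 130µs fastest latency and 1.4% of the 482µs average, while the unmodified 49µs overhead is roughly 38% and 10% of those latencies, respectively.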

6 File Server Workload

We next tested the end-to-end performance of running a server in a VM on a storage controller. We ran a file server in our VM, and used dbench [36] v4.0 to generate 4KB NFS read requests which all arrived at the 10Gb NIC, went through the local file system and block layers, through the para-virtualized block interface, and were satisfied by the ramdisk on the host side. We always allocated two cores to the controller function, and either two or four to the file server (as specified). In the virtualized cases, all file server cores were given to the VM. All cores were fully utilized in all cases.

[Figure 6: File server workload with 6 cores]

Figure 6 shows the results when running with six cores: two for the controller function and four for the file server. Bars 1 and 2 show the bare metal case without and with polling, respectively. Roughly the same performance is attained in both cases. The third bar shows the baseline measurement for the guest, which is a significant degradation compared to the bare metal cases. We identified three main causes for this performance drop. First, we noticed a large number of page faults on the host caused by the running VM. We mitigated this using the Linux kernel's HugePages mechanism, which backs a given process with 2MB pages instead of 4KB pages. This allows the OS to store fewer TLB page entries, resulting in fewer TLB faults and fewer EPT table lookups. HugePages improved performance by 1.5%, as shown in the fourth bar of Figure 6. A feature in a recent Linux kernel release makes the use of HugePages automatic [10]. The second issue affecting performance was halt exits, described in Section 4. We avoid these exits by setting the guest scheduler to poll. This further improved performance by 7.3% (fifth bar in Figure 6). The final performance improvement was to add driver polling, for both the network and block interfaces (described in Sections 4 and 5.2). This further improved performance by 19.7%, and brings the guest's performance to be statistically indistinguishable from bare metal.

Next, we ran the same workload, but this time allocated only two cores to the file server (four cores total). This may be a more common deployment when running multiple server VMs on a single physical host, for example, because there are fewer cores available for each VM. The bare metal results are depicted in Figure 7(a). The first bar shows the bare metal baseline performance. The second bar shows that performance drops when polling is used. This is because the host now has only two cores, and the polling thread utilizes a disproportionate amount of CPU resources compared to the file server. We remedied this by reducing the CPU scheduling priority of the polling thread (bar 3), and by setting the CPU affinities of the polling thread and some of the file server processes so that they share the same core (bar 4). These two changes bring the performance back to the baseline.

[Figure 7: File server workload with 4 cores]

In the guest case, depicted in Figure 7(b), the baseline (bar 1) is approximately 36% lower than the bare metal case. Bar 2 includes the HugePages and idle polling optimizations previously described, and bar 3 adds driver polling. Similar to the bare metal case, we adjusted the polling thread's scheduling priority and the affinities of the relevant processes (bars 4 and 5), bringing us to results that are statistically indistinguishable from bare metal. In all cases, tuning was not difficult, and a wide range of values provided the achieved performance.
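As a minimal sketch of the kind of tuning used in bars 3 to 5 (not the paper's actual code), the polling thread can pin itself to a chosen core and lower its own scheduling priority using standard Linux interfaces; the core number and nice value below are illustrative assumptions.

    /* Sketch: called once by the polling thread at startup so that it shares
     * a core with selected server processes without starving them. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <sys/resource.h>
    #include <stdio.h>

    static int tune_self_for_polling(int core, int niceness)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);

        /* Pin the calling thread to one core; the file server processes meant
         * to share that core would be pinned the same way (or with taskset). */
        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return -1;
        }

        /* A higher nice value lowers priority, so the busy poller only gets
         * the cycles the server leaves unused. */
        if (setpriority(PRIO_PROCESS, 0, niceness) != 0) {
            perror("setpriority");
            return -1;
        }
        return 0;
    }

    int main(void)
    {
        return tune_self_for_polling(1, 10) != 0;
    }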
7 Related Work

Several works explored the idea of running VMs on storage controllers. The IBM DS8300 storage controller uses logical partitions (LPARs) to enable the creation of two fault-isolated and performance-isolated virtual storage systems on one physical controller [12]. Pivot3 [24] and ParaScale [22] are integrated virtualization and scale-out SAN storage platforms that are geared to data centers. Fido [8] investigated using shared memory to implement zero-copy inter-VM communication in Xen in the context of enterprise-class server appliances. Our focus is different in that we investigate external communication, zero-copy communication with the controller software, and various techniques and methods to reduce overheads caused by I/O virtualization. Block Mason [21] used building blocks implemented in VMs to extend block storage functionality. VMware VSA [37] pools the internal storage resources of several servers into a shared storage pool, using dedicated virtual machines running on each server.

Several works explored off-loading I/O to dedicated cores [3, 15, 16, 19]. The closest to ours is VPE [19], which adds host-side polling to KVM's virtio network stack. The VPE thread polls the network device for incoming packets and polls the guest device driver for new requests. However, the guest incurs exit overheads for interrupts and I/O completions, since its driver does not poll. Dedicating cores for improving I/O performance has also been explored in TCP onloading [25, 32, 33].

There have been several works that investigated reducing interrupt overhead. The Linux kernel uses NAPI to disable interrupts for incoming packets as long as there are packets to be processed [30, 31]. A hybrid approach is to use interrupts under low load, and polling when more throughput is needed [11]. With interrupt coalescing, a single interrupt is generated for a given number of events or in a pre-defined time period [2, 27]. A series of works compared these techniques qualitatively and quantitatively [28, 29]. Rather than polling for fixed intervals or according to arrival rates, QAPolling uses the system state as determined by applications' receive queues [9]. The Polling Watchdog uses a hardware extension to trigger interrupts only when polling fails to handle a message in a timely manner [20]. ELI (ExitLess Interrupts [13]) is a recently-published software-only approach for handling interrupts within guest virtual machines directly and securely. ELI removes the host from the interrupt handling paths, thereby allowing guests to reach 97%–100% of bare-metal performance for I/O-intensive workloads.

8 Conclusions and Future Work

We have shown how to use a hypervisor to host and isolate new storage system functions with negligible runtime performance overhead. The techniques we demonstrated, such as polling, dedicated cores, and avoiding page lookups, while not general purpose, are a good fit for our usage scenario and have a significant payback.

There are several possible extensions. First, ELI [13] is a promising new approach for exitless interrupts which would remove the need to poll in the guest; we are investigating incorporating it into our system. Second, if we stay with polling, we can explore ways to better utilize the polling cores, e.g., to on-board the TCP stack to a core. Third, we can also benchmark these techniques when running multiple VMs. Finally, we can examine how to leverage the fact that we have virtualized the new storage function's implementation to take advantage of features such as VM migration to improve performance and availability.

Acknowledgments

Thank you to our shepherd Arkady Kanevsky and the reviewers for their helpful comments. Thanks to Zorik Machulsky and Michael Vasiliev for assisting with the hardware. The research leading to the results presented in this paper is partially supported by the European Community's Seventh Framework Programme (FP7/2007-2013) under the IOLanes grant agreement.

References

[1] K. Adams and O. Agesen. A comparison of software and hardware techniques for x86 virtualization. In Architectural Support for Programming Languages & Operating Systems (ASPLOS), 2006.

[2] I. Ahmad, A. Gulati, and A. Mashtizadeh. vIC: Interrupt coalescing for virtual machine storage device IO. In USENIX Annual Technical Conf., 2011.

[3] N. Amit, M. Ben-Yehuda, D. Tsafrir, and A. Schuster. vIOMMU: Efficient IOMMU emulation. In USENIX Annual Technical Conf., 2011.
[4] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield. Xen and the art of virtualization. In ACM Symposium on Operating Systems Principles (SOSP). ACM SIGOPS, October 2003.

[5] M. Ben-Yehuda, M. D. Day, Z. Dubitzky, M. Factor, N. Har'El, A. Gordon, A. Liguori, O. Wasserman, and B. Yassour. The Turtles project: Design and implementation of nested virtualization. In USENIX Symposium on Operating Systems Design & Implementation (OSDI), 2010.

[6] M. Ben-Yehuda, J. Mason, J. Xenidis, O. Krieger, L. van Doorn, J. Nakajima, A. Mallick, and E. Wahlig. Utilizing IOMMUs for virtualization in Linux and Xen. In Ottawa Linux Symposium (OLS), July 2006.

[7] M. Ben-Yehuda, J. Xenidis, M. Ostrowski, K. Rister, A. Bruemmer, and L. van Doorn. The price of safety: Evaluating IOMMU performance. In the 2007 Ottawa Linux Symposium, June 2007.

[8] A. Burtsev, K. Srinivasan, P. Radhakrishnan, L. N. Bairavasundaram, K. Voruganti, and G. R. Goodson. Fido: Fast inter-virtual-machine communication for enterprise appliances. In USENIX Annual Technical Conf., June 2009.

[9] X. Chang, J. Muppala, W. Kong, P. Zou, X. Li, and Z. Zheng. A queue-based adaptive polling scheme to improve system performance in gigabit ethernet networks. In IEEE International Performance Computing and Communications Conf., 2007.

[10] J. Corbet. Transparent hugepages. http://lwn.net/Articles/423584/, January 2011.

[11] C. Dovrolis, B. Thayer, and P. Ramanathan. HIP: Hybrid interrupt-polling for the network interface. ACM SIGOPS Operating Systems Review, 35(4):50–60, October 2001.

[12] B. Dufrasne, A. Baer, P. Klee, and D. Paulin. IBM System Storage DS8000: Architecture and implementation. Technical report, IBM, October 2009.

[13] A. Gordon, N. Amit, N. Har'El, M. Ben-Yehuda, A. Landau, D. Tsafrir, and A. Schuster. ELI: Bare-metal performance for I/O virtualization. In Architectural Support for Programming Languages & Operating Systems (ASPLOS), 2012.

[14] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori. kvm: the Linux virtual machine monitor. In the 2007 Ottawa Linux Symposium, volume 1, June 2007.

[15] S. Kumar, H. Raj, K. Schwan, and I. Ganev. Re-architecting VMMs for multicore systems: The sidecore approach. In Workshop on Interaction between Operating Systems & Computer Architecture (WIOSCA), 2007.

[16] A. Landau, M. Ben-Yehuda, and A. Gordon. SplitX: Split guest/hypervisor execution on multicore. In USENIX Workshop on I/O Virtualization (WIOV), 2011.

[17] J. LeVasseur, V. Uhlig, J. Stoess, and S. Götz. Unmodified device driver reuse and improved system dependability via virtual machines. In USENIX Symposium on Operating Systems Design & Implementation (OSDI), 2004.

[18] J. Liu. Evaluating standard-based self-virtualizing devices: A performance study on 10GbE NICs with SR-IOV support. In IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2010.

[19] J. Liu and B. Abali. Virtualization polling engine (VPE): Using dedicated CPU cores to accelerate I/O virtualization. In the 23rd International Conference on Supercomputing, June 2009.

[20] O. Maquelin, G. R. Gao, H. H. J. Hum, K. B. Theobald, and X. Tian. Polling watchdog: Combining polling and interrupts for efficient message handling. ACM SIGARCH Computer Architecture News, 24(2), May 1996.

[21] D. T. Meyer, B. Cully, J. Wires, N. C. Hutchinson, and A. Warfield. Block Mason. In USENIX Workshop on I/O Virtualization (WIOV), 2008.

[22] ParaScale. Cloud computing and the ParaScale platform: An inside view to cloud storage adoption. White Paper, 2010.

[23] PCI-SIG. Single root I/O virtualization 1.1 specification, January 2010. specifications/iov/single_root/.

[24] Pivot3. Pivot3 serverless computing technology overview. White Paper, April 2010.

[25] G. Regnier, S. Makineni, I. Illikkal, R. Iyer, D. Minturn, R. Huggahalli, D. Newell, L. Cline, and A. Foong. TCP onloading for data center servers. IEEE Computer, 37(11):48–58, 2004.

[26] R. Russell. virtio: Towards a de-facto standard for virtual I/O devices. ACM SIGOPS Operating Systems Review, 42(5):95–103, July 2008.

[27] K. Salah. To coalesce or not to coalesce. International Journal of Electronics and Communications, 61(4), 2007.

[28] K. Salah, K. El-Badawi, and F. Haidari. Performance analysis and comparison of interrupt-handling schemes in gigabit networks. Journal of Computer Communications, 30(17), 2007.

[29] K. Salah and A. Qahtan. Implementation and experimental performance evaluation of a hybrid interrupt-handling scheme. Journal of Computer Communications, 32(1), 2009.

[30] J. Salim. When NAPI comes to town. In Linux 2005 Conf., August 2005.

[31] J. H. Salim, R. Olsson, and A. Kuznetsov. Beyond Softnet. In Annual Linux Showcase & Conf., 2001.

[32] L. Shalev, V. Makhervaks, Z. Machulsky, G. Biran, J. Satran, M. Ben-Yehuda, and I. Shimony. Loosely coupled TCP acceleration architecture. In The 14th IEEE Symposium on High-Performance Interconnects, August 2006.

[33] L. Shalev, J. Satran, E. Borovik, and M. Ben-Yehuda.
IsoStack: Highly efficient network processing on dedicated cores. In USENIX Annual Technical Conf., June 2010.

[34] SPC. Storage Performance Council. www.storageperformance.org.

[35] J. Sugerman, G. Venkitachalam, and B. Lim. Virtualizing I/O devices on VMware Workstation's hosted virtual machine monitor. In USENIX Annual Technical Conf., 2001.

[36] A. Tridgell and R. Sahlberg. DBENCH. dbench.samba.org/, 2011.

[37] VMware. Virtual storage appliance. vmware.com/products/datacenter-virtualization/vsphere/vsphere-storage-appliance/overview.html.

[38] P. Willmann, S. Rixner, and A. L. Cox. Protection strategies for direct access to virtualized I/O devices. In USENIX Annual Technical Conf., June 2008.

[39] B. Yassour, M. Ben-Yehuda, and O. Wasserman. Direct device assignment for untrusted fully-virtualized virtual machines. Technical Report H-0263, IBM Research, 2008.


Live Virtual Machine Migration with Efficient Working Set Prediction 2011 International Conference on Network and Electronics Engineering IPCSIT vol.11 (2011) (2011) IACSIT Press, Singapore Live Virtual Machine Migration with Efficient Working Set Prediction Ei Phyu Zaw

More information

Lecture 7. Xen and the Art of Virtualization. Paul Braham, Boris Dragovic, Keir Fraser et al. 16 November, Advanced Operating Systems

Lecture 7. Xen and the Art of Virtualization. Paul Braham, Boris Dragovic, Keir Fraser et al. 16 November, Advanced Operating Systems Lecture 7 Xen and the Art of Virtualization Paul Braham, Boris Dragovic, Keir Fraser et al. Advanced Operating Systems 16 November, 2011 SOA/OS Lecture 7, Xen 1/38 Contents Virtualization Xen Memory CPU

More information

Xen and the Art of Virtualization. CSE-291 (Cloud Computing) Fall 2016

Xen and the Art of Virtualization. CSE-291 (Cloud Computing) Fall 2016 Xen and the Art of Virtualization CSE-291 (Cloud Computing) Fall 2016 Why Virtualization? Share resources among many uses Allow heterogeneity in environments Allow differences in host and guest Provide

More information

Chapter 3 Virtualization Model for Cloud Computing Environment

Chapter 3 Virtualization Model for Cloud Computing Environment Chapter 3 Virtualization Model for Cloud Computing Environment This chapter introduces the concept of virtualization in Cloud Computing Environment along with need of virtualization, components and characteristics

More information

CS 350 Winter 2011 Current Topics: Virtual Machines + Solid State Drives

CS 350 Winter 2011 Current Topics: Virtual Machines + Solid State Drives CS 350 Winter 2011 Current Topics: Virtual Machines + Solid State Drives Virtual Machines Resource Virtualization Separating the abstract view of computing resources from the implementation of these resources

More information

What s New in VMware vsphere 4.1 Performance. VMware vsphere 4.1

What s New in VMware vsphere 4.1 Performance. VMware vsphere 4.1 What s New in VMware vsphere 4.1 Performance VMware vsphere 4.1 T E C H N I C A L W H I T E P A P E R Table of Contents Scalability enhancements....................................................................

More information

A Case for High Performance Computing with Virtual Machines

A Case for High Performance Computing with Virtual Machines A Case for High Performance Computing with Virtual Machines Wei Huang*, Jiuxing Liu +, Bulent Abali +, and Dhabaleswar K. Panda* *The Ohio State University +IBM T. J. Waston Research Center Presentation

More information

Network device virtualization: issues and solutions

Network device virtualization: issues and solutions Network device virtualization: issues and solutions Ph.D. Seminar Report Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy by Debadatta Mishra Roll No: 114050005

More information

Re-architecting Virtualization in Heterogeneous Multicore Systems

Re-architecting Virtualization in Heterogeneous Multicore Systems Re-architecting Virtualization in Heterogeneous Multicore Systems Himanshu Raj, Sanjay Kumar, Vishakha Gupta, Gregory Diamos, Nawaf Alamoosa, Ada Gavrilovska, Karsten Schwan, Sudhakar Yalamanchili College

More information

Todd Deshane, Ph.D. Student, Clarkson University Xen Summit, June 23-24, 2008, Boston, MA, USA.

Todd Deshane, Ph.D. Student, Clarkson University Xen Summit, June 23-24, 2008, Boston, MA, USA. Todd Deshane, Ph.D. Student, Clarkson University Xen Summit, June 23-24, 2008, Boston, MA, USA. Xen and the Art of Virtualization (2003) Reported remarkable performance results Xen and the Art of Repeated

More information

references Virtualization services Topics Virtualization

references Virtualization services Topics Virtualization references Virtualization services Virtual machines Intel Virtualization technology IEEE xplorer, May 2005 Comparison of software and hardware techniques for x86 virtualization ASPLOS 2006 Memory resource

More information

The Challenges of X86 Hardware Virtualization. GCC- Virtualization: Rajeev Wankar 36

The Challenges of X86 Hardware Virtualization. GCC- Virtualization: Rajeev Wankar 36 The Challenges of X86 Hardware Virtualization GCC- Virtualization: Rajeev Wankar 36 The Challenges of X86 Hardware Virtualization X86 operating systems are designed to run directly on the bare-metal hardware,

More information

Revisiting virtualized network adapters

Revisiting virtualized network adapters Revisiting virtualized network adapters Luigi Rizzo, Giuseppe Lettieri, Vincenzo Maffione, Università di Pisa, Italy rizzo@iet.unipi.it, http://info.iet.unipi.it/ luigi/vale/ Draft 5 feb 2013. Please do

More information

A Userspace Packet Switch for Virtual Machines

A Userspace Packet Switch for Virtual Machines SHRINKING THE HYPERVISOR ONE SUBSYSTEM AT A TIME A Userspace Packet Switch for Virtual Machines Julian Stecklina OS Group, TU Dresden jsteckli@os.inf.tu-dresden.de VEE 2014, Salt Lake City 1 Motivation

More information

Reducing CPU usage of a Toro Appliance

Reducing CPU usage of a Toro Appliance Reducing CPU usage of a Toro Appliance Matias E. Vara Larsen matiasevara@gmail.com Who am I? Electronic Engineer from Universidad Nacional de La Plata, Argentina PhD in Computer Science, Universite NiceSophia

More information

Making Nested Virtualization Real by Using Hardware Virtualization Features

Making Nested Virtualization Real by Using Hardware Virtualization Features Making Nested Virtualization Real by Using Hardware Virtualization Features May 28, 2013 Jun Nakajima Intel Corporation 1 Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL

More information

CLOUD COMPUTING IT0530. G.JEYA BHARATHI Asst.Prof.(O.G) Department of IT SRM University

CLOUD COMPUTING IT0530. G.JEYA BHARATHI Asst.Prof.(O.G) Department of IT SRM University CLOUD COMPUTING IT0530 G.JEYA BHARATHI Asst.Prof.(O.G) Department of IT SRM University What is virtualization? Virtualization is way to run multiple operating systems and user applications on the same

More information

Virtualization. Michael Tsai 2018/4/16

Virtualization. Michael Tsai 2018/4/16 Virtualization Michael Tsai 2018/4/16 What is virtualization? Let s first look at a video from VMware http://www.vmware.com/tw/products/vsphere.html Problems? Low utilization Different needs DNS DHCP Web

More information

Multi-Hypervisor Virtual Machines: Enabling An Ecosystem of Hypervisor-level Services

Multi-Hypervisor Virtual Machines: Enabling An Ecosystem of Hypervisor-level Services Multi-Hypervisor Virtual Machines: Enabling An Ecosystem of Hypervisor-level s Kartik Gopalan, Rohith Kugve, Hardik Bagdi, Yaohui Hu Binghamton University Dan Williams, Nilton Bila IBM T.J. Watson Research

More information

Nested Virtualization Update From Intel. Xiantao Zhang, Eddie Dong Intel Corporation

Nested Virtualization Update From Intel. Xiantao Zhang, Eddie Dong Intel Corporation Nested Virtualization Update From Intel Xiantao Zhang, Eddie Dong Intel Corporation Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED,

More information

CS-580K/480K Advanced Topics in Cloud Computing. VM Virtualization II

CS-580K/480K Advanced Topics in Cloud Computing. VM Virtualization II CS-580K/480K Advanced Topics in Cloud Computing VM Virtualization II 1 How to Build a Virtual Machine? 2 How to Run a Program Compiling Source Program Loading Instruction Instruction Instruction Instruction

More information

Virtualisation: The KVM Way. Amit Shah

Virtualisation: The KVM Way. Amit Shah Virtualisation: The KVM Way Amit Shah amit.shah@qumranet.com foss.in/2007 Virtualisation Simulation of computer system in software Components Processor Management: register state, instructions, exceptions

More information

A Memory Management of Virtual Machines Created By KVM Hypervisor

A Memory Management of Virtual Machines Created By KVM Hypervisor A Memory Management of Virtual Machines Created By KVM Hypervisor Priyanka Career Point University Kota, India ABSTRACT virtualization logical abstraction of physical resource is possible which is very

More information

Achieve Low Latency NFV with Openstack*

Achieve Low Latency NFV with Openstack* Achieve Low Latency NFV with Openstack* Yunhong Jiang Yunhong.Jiang@intel.com *Other names and brands may be claimed as the property of others. Agenda NFV and network latency Why network latency on NFV

More information

Netchannel 2: Optimizing Network Performance

Netchannel 2: Optimizing Network Performance Netchannel 2: Optimizing Network Performance J. Renato Santos +, G. (John) Janakiraman + Yoshio Turner +, Ian Pratt * + HP Labs - * XenSource/Citrix Xen Summit Nov 14-16, 2007 2003 Hewlett-Packard Development

More information

Network Adapters. FS Network adapter are designed for data center, and provides flexible and scalable I/O solutions. 10G/25G/40G Ethernet Adapters

Network Adapters. FS Network adapter are designed for data center, and provides flexible and scalable I/O solutions. 10G/25G/40G Ethernet Adapters Network Adapters IDEAL FOR DATACENTER, ENTERPRISE & ISP NETWORK SOLUTIONS FS Network adapter are designed for data center, and provides flexible and scalable I/O solutions. 10G/25G/40G Ethernet Adapters

More information

CS370 Operating Systems

CS370 Operating Systems CS370 Operating Systems Colorado State University Yashwant K Malaiya Fall 2017 Lecture 27 Virtualization Slides based on Various sources 1 1 Virtualization Why we need virtualization? The concepts and

More information

Nested Virtualization Friendly KVM

Nested Virtualization Friendly KVM Nested Virtualization Friendly KVM Sheng Yang, Qing He, Eddie Dong 1 Virtualization vs. Nested Virtualization Single-Layer Virtualization Multi-Layer (Nested) Virtualization (L2) Virtual Platform (L1)

More information

PCI Express x8 Quad Port 10Gigabit Server Adapter (Intel XL710 Based)

PCI Express x8 Quad Port 10Gigabit Server Adapter (Intel XL710 Based) NIC-PCIE-4SFP+-PLU PCI Express x8 Quad Port 10Gigabit Server Adapter (Intel XL710 Based) Key Features Quad-port 10 GbE adapters PCI Express* (PCIe) 3.0, x8 Exceptional Low Power Adapters Network Virtualization

More information

SR-IOV Networking in Xen: Architecture, Design and Implementation

SR-IOV Networking in Xen: Architecture, Design and Implementation SR-IOV Networking in Xen: Architecture, Design and Implementation Yaozu Dong, Zhao Yu and Greg Rose Abstract. SR-IOV capable network devices offer the benefits of direct I/O throughput and reduced CPU

More information

Intel Virtualization Technology Roadmap and VT-d Support in Xen

Intel Virtualization Technology Roadmap and VT-d Support in Xen Intel Virtualization Technology Roadmap and VT-d Support in Xen Jun Nakajima Intel Open Source Technology Center Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS.

More information

Improve VNF safety with Vhost-User/DPDK IOMMU support

Improve VNF safety with Vhost-User/DPDK IOMMU support Improve VNF safety with Vhost-User/DPDK IOMMU support No UIO anymore! Maxime Coquelin Software Engineer KVM Forum 2017 AGENDA Background Vhost-user device IOTLB implementation Benchmarks Future improvements

More information

Virtualization and memory hierarchy

Virtualization and memory hierarchy Virtualization and memory hierarchy Computer Architecture J. Daniel García Sánchez (coordinator) David Expósito Singh Francisco Javier García Blas ARCOS Group Computer Science and Engineering Department

More information

Network Design Considerations for Grid Computing

Network Design Considerations for Grid Computing Network Design Considerations for Grid Computing Engineering Systems How Bandwidth, Latency, and Packet Size Impact Grid Job Performance by Erik Burrows, Engineering Systems Analyst, Principal, Broadcom

More information

Fast packet processing in the cloud. Dániel Géhberger Ericsson Research

Fast packet processing in the cloud. Dániel Géhberger Ericsson Research Fast packet processing in the cloud Dániel Géhberger Ericsson Research Outline Motivation Service chains Hardware related topics, acceleration Virtualization basics Software performance and acceleration

More information

Motivation CPUs can not keep pace with network

Motivation CPUs can not keep pace with network Deferred Segmentation For Wire-Speed Transmission of Large TCP Frames over Standard GbE Networks Bilic Hrvoye (Billy) Igor Chirashnya Yitzhak Birk Zorik Machulsky Technion - Israel Institute of technology

More information

Virtualization. Operating Systems, 2016, Meni Adler, Danny Hendler & Amnon Meisels

Virtualization. Operating Systems, 2016, Meni Adler, Danny Hendler & Amnon Meisels Virtualization Operating Systems, 2016, Meni Adler, Danny Hendler & Amnon Meisels 1 What is virtualization? Creating a virtual version of something o Hardware, operating system, application, network, memory,

More information

I/O virtualization. Jiang, Yunhong Yang, Xiaowei Software and Service Group 2009 虚拟化技术全国高校师资研讨班

I/O virtualization. Jiang, Yunhong Yang, Xiaowei Software and Service Group 2009 虚拟化技术全国高校师资研讨班 I/O virtualization Jiang, Yunhong Yang, Xiaowei 1 Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,

More information