VIDAS: OBJECT-BASED VIRTUALIZED DATA SHARING FOR HIGH PERFORMANCE STORAGE I/O

VIDAS: OBJECT-BASED VIRTUALIZED DATA SHARING FOR HIGH PERFORMANCE STORAGE I/O
Pablo Llopis, Javier Garcia Blas, Florin Isaila, Jesus Carretero
Computer Architecture and Technology Group (ARCOS; Grupo de Arquitectura de Computadores, Comunicaciones y Sistemas)
University Carlos III of Madrid, Spain
ScienceCloud, New York City, June 2013

Agenda
1. Motivation
2. Goals and common problems
3. Common storage I/O virtualization solutions
4. VIDAS Design and Implementation
5. Evaluation
6. Conclusion

Motivation
- HPC in virtualized clouds is gaining popularity
- The increasing amount of data stresses the importance of I/O performance
- I/O overhead in virtualized environments is still significant
- There is agreement that POSIX is not suitable for high performance
- Virtualized environments trade off protection for performance
- I/O coordination mechanisms across domains are lacking

Goals and contributions
Propose new node-level abstractions and mechanisms in virtualized environments which:
- Enable building efficient virtualized data sharing
- Coordinate I/O across domains
- Provide shared access spaces across node-local domains
- Relax POSIX consistency
- Allow flexible data write and data read policies
- Expose data locality

Agenda
1. Motivation
2. Goals and common problems
3. Common storage I/O virtualization solutions
   3.1. Block-level solution
   3.2. Filesystem-level solution
4. VIDAS Design and Implementation
5. Evaluation
6. Conclusion

Virtualized block device drivers
[Figure: three I/O stacks, A1 to A3, showing the guest, hypervisor, and host layers with their buffer cache, scheduler, and device drivers]
- A1: virtualized block-level device on top of a physical device driver
- A2: virtualized block-level device on top of a filesystem
- A3: paravirtualized device drivers (front-end and back-end)

Virtualized filesystems
[Figure: three I/O stacks, B1 to B3, showing the guest, hypervisor, and host layers with their buffer cache, scheduler, and device drivers]
- B1: virtualized filesystem within a file (stacked filesystems)
- B2: paravirtualized filesystem
- B3: sophisticated paravirtualized filesystem with inter-domain coordination

Agenda
1. Motivation
2. Goals and common problems
3. Common storage I/O virtualization solutions
4. VIDAS Design and Implementation
   4.1. Abstractions
   4.2. Design
   4.3. Implementation
5. Evaluation

VIDAS
[Figure: Guest 0, Guest 1, ..., Guest N-1 share Object 0, Object 1, Object 2, ..., Object M-1, which are backed by storage]

Abstractions
- Objects
- Containers
[Figure: Guest 0 to Guest N-1 issue write and update operations on Object 0 to Object M-1; each object is labeled with an offset and a size]

Containers
Containers are access control domains that allow objects to be shared among a set of guests.
[Figure: Guest 0 to Guest N-1 issue write and update operations on Object 0 to Object M-1; each object is labeled with an offset and a size]

Objects: POSIX differences
- Strong consistency is not enforced, but is optional
- Data writes and updates are guided by a configurable policy: write-through or write-back
- Objects provide locality awareness
- Each object is uniquely associated with an external storage resource through the resource's name
[Figure: Guest 0 to Guest N-1 issue write and update operations on Object 0 to Object M-1; each object is labeled with an offset and a size]
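The difference between the two write policies can be illustrated with a minimal sketch. Everything in it is hypothetical: the `struct object` layout, the `sketch_write` and `sketch_flush` names, and the fixed object size are assumptions made for illustration and are not taken from the VIDAS implementation. Two in-memory buffers stand in for the shared object data and the external storage resource.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout: this is not the VIDAS implementation. */
enum write_policy { WRITE_THROUGH, WRITE_BACK };

#define OBJ_SIZE 64

struct object {
    enum write_policy policy;
    char shared[OBJ_SIZE];   /* data visible to guests (stands in for shared memory) */
    char storage[OBJ_SIZE];  /* stands in for the external storage resource */
    int dirty;               /* nonzero when shared[] is newer than storage[] */
};

/* Copy the whole object to backing storage and clear the dirty flag. */
void sketch_flush(struct object *o)
{
    assert(o != NULL);
    memcpy(o->storage, o->shared, OBJ_SIZE);
    o->dirty = 0;
}

/* Write sz bytes at offset; returns 0 on success, -1 if out of range. */
int sketch_write(struct object *o, const char *buf, size_t offset, size_t sz)
{
    assert(o != NULL && buf != NULL);
    if (offset > OBJ_SIZE || sz > OBJ_SIZE - offset)
        return -1;
    memcpy(o->shared + offset, buf, sz);
    o->dirty = 1;
    if (o->policy == WRITE_THROUGH)
        sketch_flush(o);     /* propagate to storage immediately */
    return 0;                /* WRITE_BACK: storage is updated by a later flush */
}
```

With `WRITE_THROUGH`, every write reaches the backing buffer before the call returns. With `WRITE_BACK`, the object stays dirty until `sketch_flush` is called, which is the role an explicit flush operation would play.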

API Overview

Container operations
    int container_create(char *name, int domain_ids[])
    int container_destroy(char *name)
    int container_attach(char *name)
    int container_leave(char *name)

Object data operations
    int obj_write(obj_handle_t o, char *buf, size_t offset, size_t sz)
    int obj_read(obj_handle_t o, char *buf, size_t offset, size_t sz)
    int obj_flush(obj_handle_t o)
    int obj_update(obj_handle_t o)

Object metadata operations
    obj_handle_t object_create(char *ext_storage_rsc, size_t offset, size_t size, char *cname)
    obj_handle_t object_join(char *ext_storage_rsc, size_t offset, size_t size, char *cname)
    int object_get_locality(char *ext_storage_rsc, obj_handle_t *objects[])
    int object_leave(obj_handle_t o)
    int object_destroy(obj_handle_t o)
    int object_getattr(obj_handle_t o, char *name, void *value, size_t size)
    int object_setattr(obj_handle_t o, char *name, void *value, size_t size)

Object synchronization operations
    int object_wait(obj_handle_t o, char **bufp)
    int object_notify(obj_handle_t o, char *buf)
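To make the call sequence concrete, here is a small in-memory mock that mimics the shape of `object_create`, `obj_write`, and `obj_read`. It is only a sketch under stated assumptions: the `mock_` functions, the integer handle type, the table limits, the return conventions (0 on success, -1 on error), and the choice to treat write and read offsets as relative to the object are all hypothetical. The slides do not specify these details, and no real VIDAS code is used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In-memory stand-ins that mimic the shape of the calls listed above.
 * Types, return conventions, and limits are assumptions, not VIDAS code. */
typedef int obj_handle_t;

#define MAX_OBJECTS 4
#define MAX_OBJ_SIZE 256

struct mock_object {
    int in_use;
    size_t offset;              /* start of the region in the external resource */
    size_t size;                /* length of the region */
    char data[MAX_OBJ_SIZE];
};

static struct mock_object table[MAX_OBJECTS];

/* Create an object covering [offset, offset + size) of a named resource.
 * The resource and container names are accepted but ignored in this mock. */
obj_handle_t mock_object_create(const char *ext_storage_rsc, size_t offset,
                                size_t size, const char *cname)
{
    (void)ext_storage_rsc;
    (void)cname;
    if (size > MAX_OBJ_SIZE)
        return -1;
    for (int i = 0; i < MAX_OBJECTS; i++) {
        if (!table[i].in_use) {
            table[i].in_use = 1;
            table[i].offset = offset;
            table[i].size = size;
            memset(table[i].data, 0, sizeof table[i].data);
            return i;
        }
    }
    return -1;                  /* table full */
}

/* Validate a handle and an (offset, sz) range relative to the object. */
static int range_ok(obj_handle_t o, size_t offset, size_t sz)
{
    return o >= 0 && o < MAX_OBJECTS && table[o].in_use &&
           offset <= table[o].size && sz <= table[o].size - offset;
}

int mock_obj_write(obj_handle_t o, const char *buf, size_t offset, size_t sz)
{
    assert(buf != NULL);
    if (!range_ok(o, offset, sz))
        return -1;
    memcpy(table[o].data + offset, buf, sz);
    return 0;
}

int mock_obj_read(obj_handle_t o, char *buf, size_t offset, size_t sz)
{
    assert(buf != NULL);
    if (!range_ok(o, offset, sz))
        return -1;
    memcpy(buf, table[o].data + offset, sz);
    return 0;
}
```

A caller would create a handle for a region of a resource, write into it, and read the same bytes back through the handle. Accesses that fall outside the region, or that use an invalid handle, are rejected.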

Agenda
1. Motivation
2. Goals and common problems
3. Common storage I/O virtualization solutions
4. VIDAS Design
5. VIDAS Implementation
   5.1. Inter-domain communication mechanisms
   5.2. Implementation
6. Using VIDAS
7. Evaluation

Inter-domain communication
Inter-domain data communication techniques:
- Ring buffers
- Shared memory
[Figure: a ring buffer connecting Domain A and Domain B; shared data accessed by Domain 1, Domain 2, ..., Domain N]
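As a rough illustration of the ring buffer technique, the following is a minimal single-producer, single-consumer ring in plain C. It is a generic sketch, not the Xen ring implementation: the `struct ring` layout and the `ring_put` and `ring_get` names are hypothetical, and the memory barriers and event notifications that a real inter-domain ring requires are deliberately omitted.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SLOTS 8            /* must be a power of two */

struct ring {
    unsigned int head;          /* next slot the producer writes */
    unsigned int tail;          /* next slot the consumer reads */
    int slots[RING_SLOTS];
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
int ring_put(struct ring *r, int request)
{
    assert(r != NULL);
    if (r->head - r->tail == RING_SLOTS)
        return -1;
    r->slots[r->head % RING_SLOTS] = request;
    r->head++;
    return 0;
}

/* Consumer side: returns 0 on success, -1 if the ring is empty. */
int ring_get(struct ring *r, int *request)
{
    assert(r != NULL && request != NULL);
    if (r->head == r->tail)
        return -1;
    *request = r->slots[r->tail % RING_SLOTS];
    r->tail++;
    return 0;
}
```

A ring like this connects exactly one producer and one consumer, which matches its point-to-point use between two domains. Sharing the same data among many domains requires either one transfer per receiver or a region that all of them can map.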

Inter-domain communication
- A high-efficient inter-domain data transferring system for virtual machines. (2009). Presented at the Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication, New York, NY, USA: ACM.
- A study of Inter-domain Communication Mechanisms on Xen-based Hosting Platforms. (2009).
- Diakhaté, F., Perache, M., Namyst, R., & Jourdren, H. (2009). Efficient Shared Memory Message Passing for Inter-VM Communications. Euro-Par 2008 Workshop.
- Huang, W., Koop, M. J., Gao, Q., & Panda, D. K. (2007). Virtual machine aware communication libraries for high performance computing (p. 9). Presented at SC '07.
- Inter-domain socket communications supporting high performance and full binary compatibility on Xen. Presented at the Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments, New York, NY, USA: ACM.
- Huang, W., Koop, M. J., & Panda, D. K. (2008). Efficient one-copy MPI shared memory communication in Virtual Machines. Presented at the 2008 IEEE International Conference on Cluster Computing (CLUSTER), IEEE.
All of these optimize inter-VM data sharing and communication, but none of them works around the limitation that a page can be shared with only 2 domains.

Xen/Linux grant reference extension
Hypervisors usually provide a mechanism to share a memory page between only 2 domains at once.
[Figure: one memory page shared between Domain 1 and Domain 2]
In our work we extended Linux/Xen to provide user-space applications with the ability to share a page with any number of domains.
[Figure: one memory page shared among several domains]

Paravirtualized data sharing
While most solutions mainly use ring buffers for paravirtualization, VIDAS combines ring buffers and multi-domain shared memory.
[Figure: point-to-point versus collective data paths. In both, VM A and VM B each have an I/O stack and a PV device driver attached to a ring buffer, above the host I/O stack and device drivers; the collective case adds a shared data object and a buffer cache]

Implementation
[Figure: guest front-ends, the hypervisor, and the host back-end, connected by ring buffer transports]
- Components: front-end, object data, object attributes, ring buffer transport, container index, object index, back-end (host)
- First group of front-end operations: object_create, object_join, object_destroy, object_leave, object_get_locality, object_flush, object_update, container_create, container_destroy
- Second group of front-end operations: object_getattr, object_setattr, object_write, object_read, object_notify, object_wait

Implementation: container_create, object_create
A container creator specifies which domains are allowed to access the objects within the container.
[Figure: front-ends, hypervisor, and host back-end with container index and object index, connected by ring buffer transports]

Implementation: container_attach, object_join
Another domain can then access the objects in a container by joining it.
[Figure: front-ends, hypervisor, and host back-end with container index and object index, connected by ring buffer transports]
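The access-control step described on these two slides can be sketched as a small container record that stores the creator's list of permitted domains and checks it on attach. This is an illustrative assumption rather than the VIDAS back-end: the `struct container` layout, the `sketch_` function names, and the explicit length argument `n` (the `container_create` signature on the API slide shows no length) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DOMAINS 8
#define NAME_LEN 32

struct container {
    char name[NAME_LEN];
    int allowed[MAX_DOMAINS];   /* domain ids named by the creator */
    size_t n_allowed;
    int attached[MAX_DOMAINS];  /* domain ids that have attached */
    size_t n_attached;
};

/* Record the creator's access list; returns 0, or -1 on bad arguments. */
int sketch_container_create(struct container *c, const char *name,
                            const int domain_ids[], size_t n)
{
    assert(c != NULL && name != NULL && domain_ids != NULL);
    if (n > MAX_DOMAINS || strlen(name) >= NAME_LEN)
        return -1;
    strcpy(c->name, name);
    memcpy(c->allowed, domain_ids, n * sizeof domain_ids[0]);
    c->n_allowed = n;
    c->n_attached = 0;
    return 0;
}

/* Attach succeeds only for domains on the access list. */
int sketch_container_attach(struct container *c, int domain_id)
{
    assert(c != NULL);
    for (size_t i = 0; i < c->n_attached; i++)
        if (c->attached[i] == domain_id)
            return 0;           /* already attached */
    for (size_t i = 0; i < c->n_allowed; i++) {
        if (c->allowed[i] == domain_id) {
            c->attached[c->n_attached++] = domain_id;
            return 0;
        }
    }
    return -1;                  /* not permitted */
}
```

In this sketch a domain that is not on the creator's list is refused at attach time, so it never obtains a handle to any object in the container.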

Implementation: object_wait, object_read
[Figure: front-ends with object data and object attributes, hypervisor, and host back-end with container index and object index, connected by ring buffer transports]

Implementation: object_getattr, object_setattr, object_write, object_notify
[Figure: front-ends with object data and object attributes, hypervisor, and host back-end with container index and object index, connected by ring buffer transports]

Evaluation
- CPU: Intel Xeon, 12 cores
- RAM: 64 GB DDR3 synchronous, 1333 MHz
- HDD: 1 TB Toshiba MK1002TS
- Storage stack: Linux LVM, ext4
- Hypervisor: Xen 4.2
- Privileged domain: Linux
- Guest domains: Linux
- Synthetic benchmarks: broadcast, independent read, collective read/write

Bandwidth: In-memory communication
[Figure: bandwidth (MB/s, axis 0 to 600) versus number of virtual machines (2, 4, 8, 16) for MPI-Bcast, VIDAS-Bcast, and ringbuffer-Bcast]
- MPI-Bcast: share 128 MB of data using MPI_Bcast
- VIDAS-Bcast: share 128 MB of data using VIDAS
- ringbuffer-Bcast: transfer 128 MB of data to each VM using Xen ring buffer transports

Independent file read (512 MB)
[Figure: throughput (MB/s, axis 0 to 160) versus number of virtual machines (1, 2, 4, 8, 16) for MPI-IO (NFS), VIDAS I/O+Bcast, and MPI-IO (PVFS2)]
- MPI-IO (NFS): concurrent file read on a node-local NFS mount
- MPI-IO (PVFS2): concurrent file read on a node-local PVFS2 mount
- VIDAS I/O+Bcast: load data into an object, then broadcast to all VMs

Collective I/O (Two-phase I/O)
[Figure: read and write throughput (MB/s, axis 0 to 160) versus number of virtual machines (1, 2, 4, 8, 16) for MPI-IO collective (NFS), VIDAS collective, and MPI-IO collective (PVFS2)]
Access pattern: non-overlapping, interleaved strided vectors of 2 MB blocks
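One way to read this access pattern is as a round-robin interleaving, in which block k of the file belongs to VM k mod N. That reading is an assumption, since the slide does not give the exact layout, and the `block_offset` helper below is hypothetical. Under that assumption, the file offset of each block a VM touches can be computed as follows.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE (2u * 1024u * 1024u)   /* 2 MB blocks, as on the slide */

/* File offset of the i-th block accessed by VM `rank` out of `nvms`,
 * assuming a round-robin interleaving (block k belongs to VM k % nvms). */
uint64_t block_offset(uint64_t i, unsigned rank, unsigned nvms)
{
    assert(nvms > 0 && rank < nvms);
    return (i * nvms + rank) * (uint64_t)BLOCK_SIZE;
}
```

With 4 VMs, VM 0 starts at offset 0 and its next block begins 8 MB later, while VMs 1 to 3 fill the three 2 MB blocks in between. No two VMs ever touch the same byte, and together they cover the file contiguously.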

Conclusion and Contributions
- I/O overhead in virtualized environments is still significant
    - Reduce memory copy operations
    - Reduce domain context switches
- Agreement that POSIX is not suitable for high performance
    - Relax POSIX consistency
    - Control data write and update policies
    - Control data locality
- Virtualized environments trade off protection for performance
    - Created shared access spaces
    - Reduced memory copy operations and context switches
- Lack of I/O coordination mechanisms across domains
    - Coordinate storage I/O across domains
    - Proposed new multi-domain data sharing mechanisms

Questions?


More information

Module 1: Virtualization. Types of Interfaces

Module 1: Virtualization. Types of Interfaces Module 1: Virtualization Virtualization: extend or replace an existing interface to mimic the behavior of another system. Introduced in 1970s: run legacy software on newer mainframe hardware Handle platform

More information

Memory hierarchy. 1. Module structure. 2. Basic cache memory. J. Daniel García Sánchez (coordinator) David Expósito Singh Javier García Blas

Memory hierarchy. 1. Module structure. 2. Basic cache memory. J. Daniel García Sánchez (coordinator) David Expósito Singh Javier García Blas Memory hierarchy J. Daniel García Sánchez (coordinator) David Expósito Singh Javier García Blas Computer Architecture ARCOS Group Computer Science and Engineering Department University Carlos III of Madrid

More information

Status Update About COLO (COLO: COarse-grain LOck-stepping Virtual Machines for Non-stop Service)

Status Update About COLO (COLO: COarse-grain LOck-stepping Virtual Machines for Non-stop Service) Status Update About COLO (COLO: COarse-grain LOck-stepping Virtual Machines for Non-stop Service) eddie.dong@intel.com arei.gonglei@huawei.com yanghy@cn.fujitsu.com Agenda Background Introduction Of COLO

More information

Operating Systems 4/27/2015

Operating Systems 4/27/2015 Virtualization inside the OS Operating Systems 24. Virtualization Memory virtualization Process feels like it has its own address space Created by MMU, configured by OS Storage virtualization Logical view

More information

Chapter 3 Virtualization Model for Cloud Computing Environment

Chapter 3 Virtualization Model for Cloud Computing Environment Chapter 3 Virtualization Model for Cloud Computing Environment This chapter introduces the concept of virtualization in Cloud Computing Environment along with need of virtualization, components and characteristics

More information

Xen and the Art of Virtualization

Xen and the Art of Virtualization Xen and the Art of Virtualization Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield Presented by Thomas DuBuisson Outline Motivation

More information

icancloud Quick Installation Guide

icancloud Quick Installation Guide icancloud Quick Installation Guide Jesús Carretero Pérez Gabriel González Castañé Javier Prieto Cepeda Grupo de Arquitectura de Computadores Universidad Carlos III de Madrid 1 Table of contents 1 Introduction...

More information

HPC In The Cloud? Michael Kleber. July 2, Department of Computer Sciences University of Salzburg, Austria

HPC In The Cloud? Michael Kleber. July 2, Department of Computer Sciences University of Salzburg, Austria HPC In The Cloud? Michael Kleber Department of Computer Sciences University of Salzburg, Austria July 2, 2012 Content 1 2 3 MUSCLE NASA 4 5 Motivation wide spread availability of cloud services easy access

More information

Feedback on BeeGFS. A Parallel File System for High Performance Computing

Feedback on BeeGFS. A Parallel File System for High Performance Computing Feedback on BeeGFS A Parallel File System for High Performance Computing Philippe Dos Santos et Georges Raseev FR 2764 Fédération de Recherche LUmière MATière December 13 2016 LOGO CNRS LOGO IO December

More information

pnfs, POSIX, and MPI-IO: A Tale of Three Semantics

pnfs, POSIX, and MPI-IO: A Tale of Three Semantics Dean Hildebrand Research Staff Member PDSW 2009 pnfs, POSIX, and MPI-IO: A Tale of Three Semantics Dean Hildebrand, Roger Haskin Arifa Nisar IBM Almaden Northwestern University Agenda Motivation pnfs HPC

More information

High-performance aspects in virtualized infrastructures

High-performance aspects in virtualized infrastructures SVM 21 High-performance aspects in virtualized infrastructures Vitalian Danciu, Nils gentschen Felde, Dieter Kranzlmüller, Tobias Lindinger SVM 21 - HPC aspects in virtualized infrastructures 1/29/21 Niagara

More information

Implementation and Analysis of Large Receive Offload in a Virtualized System

Implementation and Analysis of Large Receive Offload in a Virtualized System Implementation and Analysis of Large Receive Offload in a Virtualized System Takayuki Hatori and Hitoshi Oi The University of Aizu, Aizu Wakamatsu, JAPAN {s1110173,hitoshi}@u-aizu.ac.jp Abstract System

More information

MultiLanes: Providing Virtualized Storage for OS-level Virtualization on Many Cores

MultiLanes: Providing Virtualized Storage for OS-level Virtualization on Many Cores MultiLanes: Providing Virtualized Storage for OS-level Virtualization on Many Cores Junbin Kang, Benlong Zhang, Tianyu Wo, Chunming Hu, and Jinpeng Huai Beihang University 夏飞 20140904 1 Outline Background

More information

VIProf: A Vertically Integrated Full-System Profiler

VIProf: A Vertically Integrated Full-System Profiler VIProf: A Vertically Integrated Full-System Profiler NGS Workshop, April 2007 Hussam Mousa Chandra Krintz Lamia Youseff Rich Wolski RACELab Research Dynamic software adaptation As program behavior or resource

More information

Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching

Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching Bogdan Nicolae (IBM Research, Ireland) Pierre Riteau (University of Chicago, USA) Kate Keahey (Argonne National

More information

xsim The Extreme-Scale Simulator

xsim The Extreme-Scale Simulator www.bsc.es xsim The Extreme-Scale Simulator Janko Strassburg Severo Ochoa Seminar @ BSC, 28 Feb 2014 Motivation Future exascale systems are predicted to have hundreds of thousands of nodes, thousands of

More information

Intel Enterprise Edition Lustre (IEEL-2.3) [DNE-1 enabled] on Dell MD Storage

Intel Enterprise Edition Lustre (IEEL-2.3) [DNE-1 enabled] on Dell MD Storage Intel Enterprise Edition Lustre (IEEL-2.3) [DNE-1 enabled] on Dell MD Storage Evaluation of Lustre File System software enhancements for improved Metadata performance Wojciech Turek, Paul Calleja,John

More information

CA485 Ray Walshe Google File System

CA485 Ray Walshe Google File System Google File System Overview Google File System is scalable, distributed file system on inexpensive commodity hardware that provides: Fault Tolerance File system runs on hundreds or thousands of storage

More information

Windows Support for PM. Tom Talpey, Microsoft

Windows Support for PM. Tom Talpey, Microsoft Windows Support for PM Tom Talpey, Microsoft Agenda Industry Standards Support PMDK Open Source Support Hyper-V Support SQL Server Support Storage Spaces Direct Support SMB3 and RDMA Support 2 Windows

More information

GFS: The Google File System

GFS: The Google File System GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one

More information

Power Efficiency of Hypervisor and Container-based Virtualization

Power Efficiency of Hypervisor and Container-based Virtualization Power Efficiency of Hypervisor and Container-based Virtualization University of Amsterdam MSc. System & Network Engineering Research Project II Jeroen van Kessel 02-02-2016 Supervised by: dr. ir. Arie

More information

The Impact of Inter-node Latency versus Intra-node Latency on HPC Applications The 23 rd IASTED International Conference on PDCS 2011

The Impact of Inter-node Latency versus Intra-node Latency on HPC Applications The 23 rd IASTED International Conference on PDCS 2011 The Impact of Inter-node Latency versus Intra-node Latency on HPC Applications The 23 rd IASTED International Conference on PDCS 2011 HPC Scale Working Group, Dec 2011 Gilad Shainer, Pak Lui, Tong Liu,

More information

Virtualization. Michael Tsai 2018/4/16

Virtualization. Michael Tsai 2018/4/16 Virtualization Michael Tsai 2018/4/16 What is virtualization? Let s first look at a video from VMware http://www.vmware.com/tw/products/vsphere.html Problems? Low utilization Different needs DNS DHCP Web

More information

International Journal of Computer & Organization Trends Volume5 Issue3 May to June 2015

International Journal of Computer & Organization Trends Volume5 Issue3 May to June 2015 Performance Analysis of Various Guest Operating Systems on Ubuntu 14.04 Prof. (Dr.) Viabhakar Pathak 1, Pramod Kumar Ram 2 1 Computer Science and Engineering, Arya College of Engineering, Jaipur, India.

More information

System that permanently stores data Usually layered on top of a lower-level physical storage medium Divided into logical units called files

System that permanently stores data Usually layered on top of a lower-level physical storage medium Divided into logical units called files System that permanently stores data Usually layered on top of a lower-level physical storage medium Divided into logical units called files Addressable by a filename ( foo.txt ) Usually supports hierarchical

More information

Low-Overhead Ring-Buffer of Kernel Tracing in a Virtualization System

Low-Overhead Ring-Buffer of Kernel Tracing in a Virtualization System Low-Overhead Ring-Buffer of Kernel Tracing in a Virtualization System Yoshihiro Yunomae Linux Technology Center Yokohama Research Lab. Hitachi, Ltd. 1 Introducing 1. Purpose of a low-overhead ring-buffer

More information

Virtualization with XEN. Trusted Computing CS599 Spring 2007 Arun Viswanathan University of Southern California

Virtualization with XEN. Trusted Computing CS599 Spring 2007 Arun Viswanathan University of Southern California Virtualization with XEN Trusted Computing CS599 Spring 2007 Arun Viswanathan University of Southern California A g e n d a Introduction Virtualization approaches Basic XEN Architecture Setting up XEN Bootstrapping

More information

A Comparison Study of Intel SGX and AMD Memory Encryption Technology

A Comparison Study of Intel SGX and AMD Memory Encryption Technology A Comparison Study of Intel SGX and AMD Memory Encryption Technology Saeid Mofrad, Fengwei Zhang Shiyong Lu Wayne State University {saeid.mofrad, Fengwei, Shiyong}@wayne.edu Weidong Shi (Larry) University

More information

Virtualization. Pradipta De

Virtualization. Pradipta De Virtualization Pradipta De pradipta.de@sunykorea.ac.kr Today s Topic Virtualization Basics System Virtualization Techniques CSE506: Ext Filesystem 2 Virtualization? A virtual machine (VM) is an emulation

More information

A Design of Hybrid Operating System for a Parallel Computer with Multi-Core and Many-Core Processors

A Design of Hybrid Operating System for a Parallel Computer with Multi-Core and Many-Core Processors A Design of Hybrid Operating System for a Parallel Computer with Multi-Core and Many-Core Processors Mikiko Sato 1,5 Go Fukazawa 1 Kiyohiko Nagamine 1 Ryuichi Sakamoto 1 Mitaro Namiki 1,5 Kazumi Yoshinaga

More information

SUPER CLOUD STORAGE MEASUREMENT STUDY AND OPTIMIZATION

SUPER CLOUD STORAGE MEASUREMENT STUDY AND OPTIMIZATION CS5413: HIGH PERFORMANCE SYSTEMS AND NETWORKING SUPER CLOUD STORAGE MEASUREMENT STUDY AND OPTIMIZATION December 23, 2014 Sneha Prasad (sh824@cornell.edu) Lu Yang (ly77@cornell.edu) Contents 1 Introduction....................................

More information

What is Cloud Computing? Cloud computing is the dynamic delivery of IT resources and capabilities as a Service over the Internet.

What is Cloud Computing? Cloud computing is the dynamic delivery of IT resources and capabilities as a Service over the Internet. 1 INTRODUCTION What is Cloud Computing? Cloud computing is the dynamic delivery of IT resources and capabilities as a Service over the Internet. Cloud computing encompasses any Subscriptionbased or pay-per-use

More information

What is KVM? KVM patch. Modern hypervisors must do many things that are already done by OSs Scheduler, Memory management, I/O stacks

What is KVM? KVM patch. Modern hypervisors must do many things that are already done by OSs Scheduler, Memory management, I/O stacks LINUX-KVM The need for KVM x86 originally virtualization unfriendly No hardware provisions Instructions behave differently depending on privilege context(popf) Performance suffered on trap-and-emulate

More information

Performance & Scalability Testing in Virtual Environment Hemant Gaidhani, Senior Technical Marketing Manager, VMware

Performance & Scalability Testing in Virtual Environment Hemant Gaidhani, Senior Technical Marketing Manager, VMware Performance & Scalability Testing in Virtual Environment Hemant Gaidhani, Senior Technical Marketing Manager, VMware 2010 VMware Inc. All rights reserved About the Speaker Hemant Gaidhani Senior Technical

More information

Hypervisor security. Evgeny Yakovlev, DEFCON NN, 2017

Hypervisor security. Evgeny Yakovlev, DEFCON NN, 2017 Hypervisor security Evgeny Yakovlev, DEFCON NN, 2017 whoami Low-level development in C and C++ on x86 UEFI, virtualization, security Jetico, Kaspersky Lab QEMU/KVM developer at Virtuozzo 2 Agenda Why hypervisor

More information

Netchannel 2: Optimizing Network Performance

Netchannel 2: Optimizing Network Performance Netchannel 2: Optimizing Network Performance J. Renato Santos +, G. (John) Janakiraman + Yoshio Turner +, Ian Pratt * + HP Labs - * XenSource/Citrix Xen Summit Nov 14-16, 2007 2003 Hewlett-Packard Development

More information