Tardigrade: Leveraging Lightweight Virtual Machines to Easily and Efficiently Construct Fault-Tolerant Services
1 Tardigrade: Leveraging Lightweight Virtual Machines to Easily and Efficiently Construct Fault-Tolerant Services. Jacob R. Lorch, Andrew Baumann, Lisa Glendenning, Dutch T. Meyer, Andrew Warfield
2 Our goal: Turn existing binaries into fault-tolerant services. (Presenter: Jay Lorch, Microsoft Research)
3 Example: FDS Metadata Service. Diagram labels: FDS metadata server, FDS cluster [Nightingale et al., OSDI 2012].
4 Example: FDS Metadata Service. Diagram labels: FDS metadata server, Paxos leader election, FDS cluster [Nightingale et al., OSDI 2012].
5 Techniques for making code fault-tolerant have limitations.
Techniques: use a state machine replication library; explicitly persist state to a reliable back-end.
Limitations: requires development resources; potential for oversight (non-determinism, failing to persist state, exposing non-persisted data, bugs in crash recovery).
Better: transparently make the binary fault-tolerant.
6 Outline: Motivation; Background: Asynchronous VM replication; Our solution: Lightweight VM replication; Challenges and solutions; Evaluation
7 Outline: Motivation; Background: Asynchronous VM replication; Our solution: Lightweight VM replication; Challenges and solutions; Evaluation
8 Asynchronous virtual machine replication - Remus [Cully et al., NSDI 2008]. Primary can crash at any time; backup is always a bit behind. Diagram labels: primary, backup, checkpoint delta (Δ).
9 Asynchronous virtual machine replication - Remus [Cully et al., NSDI 2008]. Diagram labels: primary, backup, output buffer.
10 Asynchronous virtual machine replication - Remus [Cully et al., NSDI 2008]. Diagram labels: primary, backup, output buffer, Ack(Δ).
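The output buffer and Ack(Δ) on slides 9 and 10 suggest the following rule: output produced by the primary is held back until the backup has acknowledged the checkpoint covering the state that produced it. Below is a minimal Python sketch of that rule under that assumption. It is not code from Remus or Tardigrade, and the class `OutputBuffer` and its method names are hypothetical.

```python
from collections import deque


class OutputBuffer:
    """Hold outbound packets until the backup acknowledges the checkpoint
    that covers the state which produced them (hypothetical sketch)."""

    def __init__(self):
        self.epoch = 0            # number of the checkpoint currently being built
        self.pending = deque()    # (epoch, packet) pairs not yet released
        self.released = []        # packets actually sent to clients

    def send(self, packet):
        # Output is tagged with the current epoch and buffered, not sent.
        self.pending.append((self.epoch, packet))

    def checkpoint(self):
        # Close the current epoch; its delta is now in flight to the backup.
        closed = self.epoch
        self.epoch += 1
        return closed

    def on_ack(self, acked_epoch):
        # The backup now holds every checkpoint up to acked_epoch, so output
        # from those epochs can no longer be lost if the primary crashes.
        while self.pending and self.pending[0][0] <= acked_epoch:
            self.released.append(self.pending.popleft()[1])


buf = OutputBuffer()
buf.send("reply-1")        # produced during epoch 0
e0 = buf.checkpoint()      # delta for epoch 0 goes to the backup
buf.send("reply-2")        # produced during epoch 1, still unprotected
buf.on_ack(e0)             # backup acknowledges epoch 0
```

After the acknowledgement only `reply-1` has been released; `reply-2` stays buffered until a later checkpoint is acknowledged, which is why client-visible latency depends on how quickly each delta can be shipped.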
11 High VM activity can delay packets. Processes unrelated to the service can balloon client-perceived latency. [Figure: latency of ping (ms) at the 50th, 95th, 99th, and 99.9th quantiles; series: Baseline, Safety Scan, Search Indexer, Update, Deduplication.]
12 Outline: Motivation; Background: Asynchronous VM replication; Our solution: Lightweight VM replication; Challenges and solutions; Evaluation
13 Our solution: Use lightweight VMs instead. Lightweight VM system examples: Xax [Douceur et al., OSDI 2008], Native Client [Sehr et al., IEEE S&P 2009], Drawbridge [Porter et al., ASPLOS 2011], Embassies [Howell et al., NSDI 2013], Bascule [Baumann et al., EuroSys 2013]. Diagram labels: service process, other processes, narrow API (e.g., ~45 calls in Bascule), LVM host, host OS.
14 Lightweight VMs can support unmodified binaries via a library OS. Diagram labels: service process, LVM API, LVM host.
15 Lightweight VMs can support unmodified binaries via a library OS. Bascule has a Windows LibOS and a Linux LibOS. Diagram labels: service process (service binary, OS API, library OS), LVM API, LVM host.
16 A lightweight VM is encapsulated by virtue of having a narrow interface. Diagram labels: service process (service binary, OS API, library OS), LVM API, LVM host.
17 Our approach: Checkpoint by interposing on existing LVM API. Interposition using existing API means LVM and LibOS don't have to change. Diagram labels: service process (service binary, OS API, library OS), LVM API, checkpointer, LVM API, LVM host; checkpoint.
18 Asynchronous Virtual Machine Replication [Cully et al., NSDI 2008] versus Lightweight Virtual Machine Replication. Diagram labels: service, library OS, checkpointing, host (shown twice); primary and backup (shown for each design).
19 Asynchronous Virtual Machine Replication [Cully et al., NSDI 2008] versus Lightweight Virtual Machine Replication (LVMR). Our implementation of LVMR is called Tardigrade. Diagram labels: guest (service + OS), checkpointing, host (shown twice); primary and backup (shown for each design).
20 Outline: Motivation; Background: Asynchronous VM replication; Our solution: Lightweight VM replication; Challenges and solutions; Evaluation
21 Practical LVMR poses challenges
Challenge: maintaining consistency across reconfigurations. Solution: Vertical Paxos.
Challenge: achieving performance potential. Solution: incremental checkpointing, checkpoint capping, parallelism, scaling send buffer size.
Challenge: checkpointing via an existing LVM API. Solution: quiescing, pre-checkpointing, enforcing determinism, terminating connections.
Callouts: "See paper for details"; "Lessons for LVM API designers".
22 Checkpointing uses certain LVM API features
Feature: ability to track changed memory pages. Purpose: efficiently compute checkpoint deltas.
Feature: ability to suspend and inspect other threads. Purpose: capture consistent snapshot.
Feature: determinism when API calls are replayed. Purpose: prevent divergence on failover.
Feature: host state either replayable or regeneratable. Purpose: recreate host state on backup.
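The first feature on slide 22, tracking changed memory pages so that checkpoint deltas can be computed efficiently, can be illustrated with a small Python sketch. This is an assumption-laden toy rather than Tardigrade's mechanism: a real LVM host would report dirty pages itself, whereas here the hypothetical `TrackedMemory` class records them on every write.

```python
PAGE_SIZE = 4096


class TrackedMemory:
    """Toy guest memory that remembers which pages were written
    since the last checkpoint (hypothetical sketch)."""

    def __init__(self, num_pages):
        self.data = bytearray(num_pages * PAGE_SIZE)
        self.dirty = set()

    def write(self, offset, payload):
        if not payload:
            return
        self.data[offset:offset + len(payload)] = payload
        first = offset // PAGE_SIZE
        last = (offset + len(payload) - 1) // PAGE_SIZE
        # Mark every page the write touched, including straddled ones.
        self.dirty.update(range(first, last + 1))

    def take_delta(self):
        # The delta holds only pages changed since the previous checkpoint.
        delta = {
            page: bytes(self.data[page * PAGE_SIZE:(page + 1) * PAGE_SIZE])
            for page in sorted(self.dirty)
        }
        self.dirty.clear()
        return delta


def apply_delta(backup, delta):
    # The backup overwrites just the pages named in the delta.
    for page, contents in delta.items():
        backup[page * PAGE_SIZE:(page + 1) * PAGE_SIZE] = contents


primary = TrackedMemory(num_pages=8)
backup = bytearray(8 * PAGE_SIZE)

primary.write(PAGE_SIZE - 2, b"abcd")   # straddles pages 0 and 1
delta = primary.take_delta()
apply_delta(backup, delta)
```

Only two of the eight pages travel to the backup here, which is the point of incremental checkpointing: the cost of a checkpoint follows the memory dirtying rate rather than the total memory size.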
23 Features may not always be in LVM APIs
Feature: ability to track changed memory pages. Workaround: none listed.
Missing feature: ability to suspend and inspect other threads. Workaround: use exceptions, pre-checkpointing.
Missing feature: determinism when API calls are replayed. Workaround: hide non-determinism.
Missing feature: host state either replayable or regeneratable. Workaround: expose divergence as error condition.
24 To capture a checkpoint, we must quiesce and capture all threads' state. What if the API doesn't let a thread suspend and inspect another thread? Diagram labels: guest (service + library OS), memory, checkpointing layer, host; primary.
25 We can use exceptions to quiesce guest threads. Diagram labels: guest (service + library OS), checkpoint, checkpointing layer, host; primary.
26 Exception handler quiesces and captures each guest thread's state. Diagram labels: guest (service + library OS), checkpoint, ExceptionHandler(, ), memory, checkpointing layer, host; primary.
27 Synchronous system calls complicate quiescence. Diagram labels: guest (service + library OS), checkpointing layer, host; primary.
28 The wait system call is easy to deal with. Diagram: a select() call passes from the guest (service + library OS) through the checkpointing layer to the host; its file descriptor list reads 0x1AC, 0x3BB, 0x907, time-to-checkpoint.
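One reading of slide 28 is that the checkpointing layer appends an extra "time-to-checkpoint" descriptor to the guest's wait list, so a thread blocked in a wait call wakes up when a checkpoint is due. The Python sketch below illustrates that reading with the standard `select` and `socket` modules. The function name `wait_with_checkpoint` and the use of socket pairs as descriptors are assumptions made for illustration, not details from the talk.

```python
import select
import socket


def wait_with_checkpoint(guest_socks, checkpoint_sock, timeout=None):
    """Wait on the guest's descriptors plus one extra descriptor that the
    checkpointing layer signals when a checkpoint is due (hypothetical)."""
    wait_list = list(guest_socks) + [checkpoint_sock]
    readable, _, _ = select.select(wait_list, [], [], timeout)
    if checkpoint_sock in readable:
        checkpoint_sock.recv(1)       # consume the wake-up signal
        return "checkpoint", []       # caller quiesces, then retries the wait
    return "ready", readable          # ordinary guest I/O is ready


# Socket pairs stand in for a guest descriptor and the checkpoint signal.
guest_rx, guest_tx = socket.socketpair()
ckpt_rx, ckpt_tx = socket.socketpair()

ckpt_tx.send(b"!")                    # checkpointing layer: time to checkpoint
first = wait_with_checkpoint([guest_rx], ckpt_rx, timeout=1.0)

guest_tx.send(b"x")                   # later, real guest input arrives
second = wait_with_checkpoint([guest_rx], ckpt_rx, timeout=1.0)
```

The guest never sees the extra descriptor: a wake-up caused by it is handled inside the interposition layer, and the original wait is reissued once the checkpoint has been taken.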
29 General synchronous system calls require pre-checkpointing. Diagram labels: guest (service + library OS), checkpointing layer, host; primary.
30 API non-determinism undermines replay. On the primary, CreateSemaphore() returns descriptor 0xAAA; on the backup, CreateSemaphore() returns descriptor 0xBBB.
31 An indirection table can hide non-determinism. Each checkpointing layer sits between the guest (service + library OS) and the host and maps guest descriptors to host descriptors.
Primary: guest descriptor 0x001 maps to host descriptor 0xAAA; guest descriptor 0x002 maps to host descriptor 0x932.
Backup: guest descriptor 0x001 maps to host descriptor 0xBBB; guest descriptor 0x002 maps to host descriptor 0x909.
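The indirection table on slide 31 can be sketched in a few lines of Python. The descriptor values come from the slide; the class `DescriptorTable` and its methods are hypothetical names, and the sequential allocation of guest descriptors is an assumption consistent with the 0x001 and 0x002 entries shown.

```python
import itertools


class DescriptorTable:
    """Map stable guest-visible descriptors to whatever descriptors the
    local host happened to return (hypothetical sketch)."""

    def __init__(self):
        self._next_guest = itertools.count(1)   # 0x001, 0x002, ...
        self._guest_to_host = {}

    def register(self, host_descriptor):
        # Allocate guest descriptors deterministically, independent of
        # the host's choice, so primary and backup agree on them.
        guest = next(self._next_guest)
        self._guest_to_host[guest] = host_descriptor
        return guest

    def host(self, guest_descriptor):
        # Translate on every call that crosses into the host.
        return self._guest_to_host[guest_descriptor]


primary = DescriptorTable()
backup = DescriptorTable()

# The same two API calls return different host descriptors on each machine.
p1 = primary.register(0xAAA)
p2 = primary.register(0x932)
b1 = backup.register(0xBBB)
b2 = backup.register(0x909)
```

Because the guest only ever stores 0x001 and 0x002, its memory image is valid on either machine; after a failover the backup resolves the same guest descriptors through its own table.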
32 State external to guest needs to be replayable or regeneratable. API provides sockets, not packets. Checkpointer can't capture TCP session state! Diagram labels: guest (service + library OS), LVM API, checkpointing layer, LVM API, host (TCP session state); primary, backup.
33 System-specific modifications may be necessary. TCP connections get dropped on a failover. Fixing this requires a major API change to make it use packets rather than sockets. Diagram labels: guest (service + library OS), checkpointing layer, host (TCP session state), shown for primary and backup.
34 Outline: Motivation; Background: Asynchronous VM replication; Our solution: Lightweight VM replication; Challenges and solutions; Evaluation
35 Effect of external processes - Remus. [Figure: latency of ping (ms) at the 50th, 95th, 99th, and 99.9th quantiles; series: Baseline, Safety Scan, Search Indexer, Update, Deduplication.]
36 Effect of external processes - Tardigrade. [Figure: latency (ms) at the 50%, 95%, 99%, and 99.9% quantiles; series: Baseline, Safety Scan, Search Indexer, Update, Deduplication.]
37 Effect of external processes - Tardigrade. [Figure: latency (ms) at the 50%, 95%, 99%, and 99.9% quantiles; series: Baseline, Safety Scan, Search Indexer, Update, Deduplication.]
38 Memory dirtying affects checkpoint latency. [Figure: CDF (%) of latency (ms); series: no dirtying, and dirtying at 10%, 20%, 30%, 40%, and 50% of net b/w.]
39 FDS metadata service. [Figure: CDF (%) of checkpoint interval (ms); series: metadata server initially idle; cluster operating normally (checkpoint delta average size: 0.9 MB); cluster starting up (checkpoint delta average size: 1.8 MB).]
40 ZKLite, a simple non-fault-tolerant Java implementation of the ZooKeeper API. [Figure: CDF (%) of client request latency (ms).]
41 Conclusions
Lightweight VM replication is practical for making existing service binaries fault-tolerant.
No changes to binaries needed, making deployment simple.
Replicating processes rather than VMs substantially reduces worst-case latency.
Lightweight virtual machine API designers should consider effect on replication.
Examples of good targets: metadata services, coordination services, niche web services.
Reasonable performance if memory dirtying rate and load are low.
More informationINTRODUCTION TO XTREMIO METADATA-AWARE REPLICATION
Installing and Configuring the DM-MPIO WHITE PAPER INTRODUCTION TO XTREMIO METADATA-AWARE REPLICATION Abstract This white paper introduces XtremIO replication on X2 platforms. XtremIO replication leverages
More informationHow VoltDB does Transactions
TEHNIL NOTE How does Transactions Note to readers: Some content about read-only transaction coordination in this document is out of date due to changes made in v6.4 as a result of Jepsen testing. Updates
More informationPetal and Frangipani
Petal and Frangipani Petal/Frangipani NFS NAS Frangipani SAN Petal Petal/Frangipani Untrusted OS-agnostic NFS FS semantics Sharing/coordination Frangipani Disk aggregation ( bricks ) Filesystem-agnostic
More informationMap-Reduce. Marco Mura 2010 March, 31th
Map-Reduce Marco Mura (mura@di.unipi.it) 2010 March, 31th This paper is a note from the 2009-2010 course Strumenti di programmazione per sistemi paralleli e distribuiti and it s based by the lessons of
More informationData Storage Revolution
Data Storage Revolution Relational Databases Object Storage (put/get) Dynamo PNUTS CouchDB MemcacheDB Cassandra Speed Scalability Availability Throughput No Complexity Eventual Consistency Write Request
More informationCSE543 - Computer and Network Security Module: Virtualization
CSE543 - Computer and Network Security Module: Virtualization Professor Trent Jaeger CSE543 - Introduction to Computer and Network Security 1 1 Operating System Quandary Q: What is the primary goal of
More informationOperating Systems (2INC0) 2018/19. Introduction (01) Dr. Tanir Ozcelebi. Courtesy of Prof. Dr. Johan Lukkien. System Architecture and Networking Group
Operating Systems (2INC0) 20/19 Introduction (01) Dr. Courtesy of Prof. Dr. Johan Lukkien System Architecture and Networking Group Course Overview Introduction to operating systems Processes, threads and
More informationstatus Emmanuel Cecchet
status Emmanuel Cecchet c-jdbc@objectweb.org JOnAS developer workshop http://www.objectweb.org - c-jdbc@objectweb.org 1-23/02/2004 Outline Overview Advanced concepts Query caching Horizontal scalability
More informationCLOUD-SCALE FILE SYSTEMS
Data Management in the Cloud CLOUD-SCALE FILE SYSTEMS 92 Google File System (GFS) Designing a file system for the Cloud design assumptions design choices Architecture GFS Master GFS Chunkservers GFS Clients
More informationVMware vsphere 5.5 Professional Bootcamp
VMware vsphere 5.5 Professional Bootcamp Course Overview Course Objectives Cont. VMware vsphere 5.5 Professional Bootcamp is our most popular proprietary 5 Day course with more hands-on labs (100+) and
More informationCross Data Center Replication in Apache Solr. Anvi Jain, Software Engineer II Amrit Sarkar, Search Engineer
Cross Data Center Replication in Apache Solr Anvi Jain, Software Engineer II Amrit Sarkar, Search Engineer Who are we? Based in Bedford, MA. Offices all around the world Progress tools and platforms enable
More informationOperating Systems 2010/2011
Operating Systems 2010/2011 Introduction Johan Lukkien 1 Agenda OS: place in the system Some common notions Motivation & OS tasks Extra-functional requirements Course overview Read chapters 1 + 2 2 A computer
More informationAurora, RDS, or On-Prem, Which is right for you
Aurora, RDS, or On-Prem, Which is right for you Kathy Gibbs Database Specialist TAM Katgibbs@amazon.com Santa Clara, California April 23th 25th, 2018 Agenda RDS Aurora EC2 On-Premise Wrap-up/Recommendation
More informationQEMU Backup. Maxim Nestratov, Virtuozzo Vladimir Sementsov-Ogievskiy, Virtuozzo
QEMU Backup Maxim Nestratov, Virtuozzo Vladimir Sementsov-Ogievskiy, Virtuozzo QEMU Backup Vladimir Sementsov-Ogievskiy, Virtuozzo Full featured backup Online backup Fast Not very invasive for the guest
More informationAn Empirical Study of High Availability in Stream Processing Systems
An Empirical Study of High Availability in Stream Processing Systems Yu Gu, Zhe Zhang, Fan Ye, Hao Yang, Minkyong Kim, Hui Lei, Zhen Liu Stream Processing Model software operators (PEs) Ω Unexpected machine
More information