An Evaluation of Checkpoint Recovery For Massively Multiplayer Online Games

Size: px
Start display at page:

Download "An Evaluation of Checkpoint Recovery For Massively Multiplayer Online Games"

Transcription

1 An Evaluation of Checkpoint Recovery For Massively Multiplayer Online Games Marcos Vaz Salles Tuan Cao Benjamin Sowell Alan Demers Johannes Gehrke Christoph Koch Walker White

2 MMO Games Simulate long-lived, interactive virtual worlds Persist across user sessions

3 MMO Games Simulate long-lived, interactive virtual worlds Can existing DBMS main-memory checkpointing techniques help to achieve low cost durability? Persist across user sessions

4 Our Contributions Show how to adapt consistent checkpointing techniques Provide a thorough simulation model and evaluate six different recovery strategies

5 Our Contributions Show how to adapt consistent checkpointing techniques Provide a thorough simulation model and evaluate six different recovery strategies Why simulation? Ability to evaluate different types of hardware

6 Talk Outline Introduction to MMOs The game state The game loop Consistent Checkpointing Techniques Requirements Performance Criteria Performance Model Experiments Experimental Setup Model Validation Results Conclusion

7 The Game State Tables containing game objects (e.g., attributes of game characters) Must be kept in main memory

8 The Game State

9 The Game Loop Update Receive player input Process player actions Handle updates 30 ticks/sec Draw The game state is consistent at the end of every tick. 7

10 The Game Loop 30 ticks/sec Update Draw Receive player input Process player actions Handle updates Perform checkpointing 33ms The game state is consistent at the end of every tick. 7

11 Requirements Small overhead Entire checkpointing process must fit into the game simulation Uniform overhead Low latency, no hiccup in the game No data loss Recover to the point of the crash

12 Requirements Small overhead Uniform overhead No data loss

13 Requirements Performance Criteria Small overhead Average Overhead Time Uniform overhead Overhead Distribution No data loss Checkpointing Time Recovery Time

14 Checkpointing Techniques All Objects Dirty Objects Log Based Double Backup Eager Copy Copy On Update

15 Checkpointing Techniques All Objects Log Eager Copy Dirty Objects Double Backups Copy On Update

16 Checkpointing Techniques All Objects Log Eager Copy Dirty Objects Double Backups Copy On Update

17 Checkpointing Techniques All Objects Log Eager Copy Dirty Objects Double Backups Copy On Update Dirty Objects

18 Checkpointing Techniques All Objects Log Based Eager Copy Dirty Objects Double Backup Copy On Update Log-Based

19 Checkpointing Techniques All Objects Log Based Eager Copy Dirty Objects Double Backup Copy On Update Log-Based

20 Checkpointing Techniques All Objects Log Based Eager Copy Dirty Objects Double Backup Copy On Update Double Backup

21 Checkpointing Techniques All Objects Log Based Eager Copy Dirty Objects Double Backup Copy On Update Double Backup Containing a consistent copy of the game state

22 Checkpointing Techniques All Objects Log Based Eager Copy Dirty Objects Double Backup Copy On Update Double Backup Containing a consistent copy of the game state

23 Checkpointing Techniques All Objects Log Based Eager Copy Dirty Objects Double Backup Copy On Update Double Backup Containing a consistent copy of the game state

24 Checkpointing Techniques All Objects Log Eager Copy Dirty Objects Double Backups Copy On Update Update Eager Copy Draw Update Draw Pause the simulation loop Copy objects to memory Creates a hiccup in the game 15

25 Checkpointing Techniques All Objects Log Eager Copy Dirty Objects Double Backups Copy On Update Update Copy On Update Draw Update Intercept each update Copy object to memory before updating it Draw Overhead is spread over ticks 16

26 Checkpointing Algorithms All Objects Dirty Objects Eager Copy Copy On Update Double Backup Log Naive-Snapshot X X X Copy-On-Update Dribble-And- Atomic-Copy- Dirty-Objects X X X X X X Partial-Redo X X X Copy-On-Update X X X Copy-On- Update-Partial- Redo X X X

27 Checkpointing Algorithms All Objects Dirty Objects Eager Copy Copy On Update Double Backup Log Naive-Snapshot X X X Copy-On-Update Dribble-And- Atomic-Copy- Dirty-Objects X X X X X X Partial-Redo X X X Copy-On-Update X X X Copy-On- Update-Partial- Redo X X X

28 Checkpointing Algorithms Naive-Snapshot Partial-Redo Copy-On-Update All Objects Dirty Objects All Objects Dirty Objects All Objects Dirty Objects Log Double Backup Log Double Backup Log Double Backup Eager Copy COU Eager Copy COU Eager Copy COU

29 All Objects Dirty Objects Naive-Snapshot Log Eager Copy Double Backup COU Update Draw Update Draw

30 All Objects Dirty Objects Naive-Snapshot Log Eager Copy Double Backup COU Update Draw Update Draw

31 All Objects Dirty Objects Partial-Redo Log Eager Copy Double Backup COU Update Draw Update Draw

32 All Objects Dirty Objects Partial-Redo Log Eager Copy Double Backup COU Update Draw Update Draw

33 All Objects Dirty Objects Copy-On-Update Log Eager Copy Double Backup COU Update Draw Copy object to main memory before updating it Update Draw

34 All Objects Dirty Objects Copy-On-Update Log Eager Copy Double Backup COU Update Draw Copy object to main memory before updating it Update Draw

35 All Objects Dirty Objects Copy-On-Update Log Eager Copy Double Backup COU Update Draw Copy object to main memory before updating it Update Draw

36 All Objects Dirty Objects Copy-On-Update Log Eager Copy Double Backup COU Update Draw Copy object to main memory before updating it Update Draw

37 Performance Model Components In main-memory copy time Disk write time Update handler overhead Recovery Time

38 Performance Model Components In main-memory copy time Disk write time Update handler overhead Recovery Time T memcopy (k) = k S obj B mem T disk-write (k) = k S obj B disk T overhead = O bit + O lock + T memcopy T recovery = T restore + T replay T restore (Partial-Redos) = (k C + n) S obj B disk T restore (The other algorithms) = n S obj B disk

39 Update Handler Overhead T overhead = O bit + O lock + T memcopy O bit O lock T memcopy Overhead to test a dirty bit The cost to lock an object Cost of a memory copy of an atomic object T memcopy = S obj B mem S obj B mem Size of an object Memory bandwidth

40 Experimental Validation Implemented Naive-Snapshot and Copy-On-Update in C++ Running on our high-end server Intel Core 2 Quad 2.4 Ghz 4MB cache 4GB RAM Ubuntu (kernel ) 80GB 7200rpm SATA drive with 8MB cache Game ticks operate at 30Hz

41 Overhead Time

42 Checkpointing Time

43 Recovery Time

44 Experiments Scaling with update rates Prototype game

45 Data Set 1: Synthetic Trace Generate updates according to a Zipf distribution Vary the number of updates per tick from 1k to 256k updates per tick

46 Data Set 2: Prototype Game Trace We simulated a medieval battle of the type common in many MMOs Knights, archers and healers, divided into two teams The objective is to defeat as many enemies as possible Refer to

47 Simulation Parameters Memory Bandwidth Memory Latency Lock Overhead Bit test/set Overhead Disk Bandwidth B mem O mem O lock O bit B disk From our micro-benchmarks running on Intel Core 2 Quad 2.4 Ghz 4MB cache 4GB RAM Ubuntu (kernel ) 80GB 7200rpm SATA drive with 8MB cache

48 Performance Criteria Average Overhead Time Overhead Distribution Recovery Time Checkpointing Time

49 Scaling on Number Of Updates Per Tick (Overhead Time) Avg. Overhead Time [sec], logscale Naive Snapshot Partial Redo Copy-on-Update # Updates per Tick, logscale

50 Scaling on Number Of Updates Per Tick (Overhead Time) Avg. Overhead Time [sec], logscale Naive Snapshot Partial Redo Copy-on-Update # Updates per Tick, logscale

51 Latency Analysis (8k updates per tick) Tick Length [sec], logscale Latency Limit Naive Snapshot Partial Redo Copy-on-Update Tick #

52 Scaling on Number Of Updates Per Tick (Overhead Time) Avg. Overhead Time [sec], logscale Naive Snapshot Partial Redo Copy-on-Update # Updates per Tick, logscale

53 Scaling on Number Of Updates Per Tick (Overhead Time) Avg. Overhead Time [sec], logscale Naive Snapshot Partial Redo Copy-on-Update # Updates per Tick, logscale

54 Latency Analysis (256k updates per tick) Tick Length [sec], logscale Latency Limit Naive Snapshot Partial Redo Copy-on-Update Tick #

55 Scaling on Number Of Updates Per Tick (Checkpointing Time) Avg. Time to Checkpoint [sec], logscale Naive Snapshot Partial-Redo Copy-on-Update # Updates per Tick, logscale

56 Scaling on Number Of Updates Per Tick (Recovery Time) Est. Recovery Time [sec], logscale Naive Snapshot Partial Redo Copy-on-Update # Updates per Tick, logscale

57 Experiments with Prototype Game Server (Overhead Time)

58 Experiments with Prototype Game Server (Checkpointing Time)

59 Experiments with Prototype Game Server (Recovery Time)

60 Conclusion Methods based on double backup recover much faster than methods checkpointing to logs For moderate update rates, the best method in terms of both latency and recovery time is Copy-On-Update In case of extremely high update rates, we recommend Naive-Snapshot Available to download at :

61 References K. Salem and H. Garcia-Molina. Checkpointing Memory-Resident Databases. In Proc. ICDE, W. White, A. Demers, C. Koch, J. Gehrke, and R. Rajagopalan. Scaling Games to Epic Proportions. In Proc. SIGMOD, G. Bronevetsky, R. Fernandes, D. Marques, K. Pingali, and P. Stodghill. Recent Advances in Checkpoint/Recovery Systems. In Proc. IPDPS, D. J. DeWitt, R. Katz, F. Olken, L. Shapiro, M. Stonebraker, and D. Wood. Implementation Techniques for Main Memory DatabaseSystems. In Proc. SIGMOD, D. Rosenkrantz. Dynamic Database Dumping. In Proc. SIGMOD, Richard Fujimoto. Parallel and Distributed Simulation Systems. John Wiley & Sons, 2000.

62 Conclusion Methods based on double backup recover much faster than methods checkpointing to logs For moderate update rates, the best method in terms of both latency and recovery time is Copy-On-Update In case of extremely high update rates, we recommend Naive-Snapshot Available to download at :

63 SSD - Scaling on Number Of Updates Per Tick (Overhead Time) Avg. Overhead Time [sec], logscale Naive Snapshot Dribble and Copy on Update Atomic Copy Dirty Objects Partial Redo Copy-on-Update Copy-on-Update-Partial-Redo # Updates per Tick, logscale

64 SSD - Latency Analysis (64k updates per tick) Tick Length [sec], logscale Latency Limit Naive Snapshot Dribble and Copy on Update Atomic Copy Dirty Objects Partial Redo Copy-on-Update Copy-on-Update-Partial-Redo Tick #

65 SSD - Scaling on Number Of Updates Per Tick (Checkpointing Time) Avg. Time to Checkpoint [sec], logscale Naive Snapshot Dribble and Copy on Update Atomic-Copy-Dirty-Objects Partial-Redo Copy-on-Update Copy-on-Update-Partial-Redo # Updates per Tick, logscale

66 SSD - Scaling on Number Of Updates Per Tick (Recovery Time) Est. Recovery Time [sec], logscale Naive Snapshot Dribble and Copy on Update Atomic Copy Dirty Objects Partial Redo Copy-on-Update Copy-on-Update-Partial-Redo # Updates per Tick, logscale

Behavioral Simulations in MapReduce

Behavioral Simulations in MapReduce Behavioral Simulations in MapReduce Guozhang Wang, Cornell University with Marcos Vaz Salles, Benjamin Sowell, Xun Wang, Tuan Cao, Alan Demers, Johannes Gehrke, and Walker White MSR DMX August 20, 2010

More information

Database Management System

Database Management System Database Management System Lecture 10 Recovery * Some materials adapted from R. Ramakrishnan, J. Gehrke and Shawn Bowers Basic Database Architecture Database Management System 2 Recovery Which ACID properties

More information

By Nitin Gupta, Alan Demers, Johannes Gehrke, Philipp Unterbrunner, Walker White at ICDE 2009

By Nitin Gupta, Alan Demers, Johannes Gehrke, Philipp Unterbrunner, Walker White at ICDE 2009 1 Scalability for Virtual Worlds By Nitin Gupta, Alan Demers, Johannes Gehrke, Philipp Unterbrunner, Walker White at ICDE 2009 Presented By, Mayuri Khardikar Net-VEs 2 Networked Virtual Environments A

More information

Redo Log Removal Mechanism for NVRAM Log Buffer

Redo Log Removal Mechanism for NVRAM Log Buffer Redo Log Removal Mechanism for NVRAM Log Buffer 2009/10/20 Kyoung-Gu Woo, Heegyu Jin Advanced Software Research Laboratories SAIT, Samsung Electronics Contents Background Basic Philosophy Proof of Correctness

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google SOSP 03, October 19 22, 2003, New York, USA Hyeon-Gyu Lee, and Yeong-Jae Woo Memory & Storage Architecture Lab. School

More information

Accommodating Logical Logging under Fuzzy Checkpointing in Main Memory Databases y

Accommodating Logical Logging under Fuzzy Checkpointing in Main Memory Databases y Accommodating Logical Logging under Fuzzy Checkpointing in Main Memory Databases y Seungkyoon Woo Myoung Ho Kim Yoon Joon Lee Department of Computer Science Korea Advanced Institute of Science and Technology

More information

SFS: Random Write Considered Harmful in Solid State Drives

SFS: Random Write Considered Harmful in Solid State Drives SFS: Random Write Considered Harmful in Solid State Drives Changwoo Min 1, 2, Kangnyeon Kim 1, Hyunjin Cho 2, Sang-Won Lee 1, Young Ik Eom 1 1 Sungkyunkwan University, Korea 2 Samsung Electronics, Korea

More information

CA485 Ray Walshe Google File System

CA485 Ray Walshe Google File System Google File System Overview Google File System is scalable, distributed file system on inexpensive commodity hardware that provides: Fault Tolerance File system runs on hundreds or thousands of storage

More information

Recoverability. Kathleen Durant PhD CS3200

Recoverability. Kathleen Durant PhD CS3200 Recoverability Kathleen Durant PhD CS3200 1 Recovery Manager Recovery manager ensures the ACID principles of atomicity and durability Atomicity: either all actions in a transaction are done or none are

More information

STORAGE LATENCY x. RAMAC 350 (600 ms) NAND SSD (60 us)

STORAGE LATENCY x. RAMAC 350 (600 ms) NAND SSD (60 us) 1 STORAGE LATENCY 2 RAMAC 350 (600 ms) 1956 10 5 x NAND SSD (60 us) 2016 COMPUTE LATENCY 3 RAMAC 305 (100 Hz) 1956 10 8 x 1000x CORE I7 (1 GHZ) 2016 NON-VOLATILE MEMORY 1000x faster than NAND 3D XPOINT

More information

Falcon: Scaling IO Performance in Multi-SSD Volumes. The George Washington University

Falcon: Scaling IO Performance in Multi-SSD Volumes. The George Washington University Falcon: Scaling IO Performance in Multi-SSD Volumes Pradeep Kumar H Howie Huang The George Washington University SSDs in Big Data Applications Recent trends advocate using many SSDs for higher throughput

More information

CHAPTER 3 RECOVERY & CONCURRENCY ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI

CHAPTER 3 RECOVERY & CONCURRENCY ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI CHAPTER 3 RECOVERY & CONCURRENCY ADVANCED DATABASE SYSTEMS Assist. Prof. Dr. Volkan TUNALI PART 1 2 RECOVERY Topics 3 Introduction Transactions Transaction Log System Recovery Media Recovery Introduction

More information

RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store

RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store Yiming Zhang, Rui Chu @ NUDT Chuanxiong Guo, Guohan Lu, Yongqiang Xiong, Haitao Wu @ MSRA June, 2012 1 Background Disk-based storage

More information

Lecture 21: Logging Schemes /645 Database Systems (Fall 2017) Carnegie Mellon University Prof. Andy Pavlo

Lecture 21: Logging Schemes /645 Database Systems (Fall 2017) Carnegie Mellon University Prof. Andy Pavlo Lecture 21: Logging Schemes 15-445/645 Database Systems (Fall 2017) Carnegie Mellon University Prof. Andy Pavlo Crash Recovery Recovery algorithms are techniques to ensure database consistency, transaction

More information

Join Processing for Flash SSDs: Remembering Past Lessons

Join Processing for Flash SSDs: Remembering Past Lessons Join Processing for Flash SSDs: Remembering Past Lessons Jaeyoung Do, Jignesh M. Patel Department of Computer Sciences University of Wisconsin-Madison $/MB GB Flash Solid State Drives (SSDs) Benefits of

More information

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010 Moneta: A High-performance Storage Array Architecture for Nextgeneration, Non-volatile Memories Micro 2010 NVM-based SSD NVMs are replacing spinning-disks Performance of disks has lagged NAND flash showed

More information

RECOVERY CHAPTER 21,23 (6/E) CHAPTER 17,19 (5/E)

RECOVERY CHAPTER 21,23 (6/E) CHAPTER 17,19 (5/E) RECOVERY CHAPTER 21,23 (6/E) CHAPTER 17,19 (5/E) 2 LECTURE OUTLINE Failures Recoverable schedules Transaction logs Recovery procedure 3 PURPOSE OF DATABASE RECOVERY To bring the database into the most

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google* 정학수, 최주영 1 Outline Introduction Design Overview System Interactions Master Operation Fault Tolerance and Diagnosis Conclusions

More information

Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed

Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan. Original slides are available

More information

Chapter 14: Recovery System

Chapter 14: Recovery System Chapter 14: Recovery System Chapter 14: Recovery System Failure Classification Storage Structure Recovery and Atomicity Log-Based Recovery Remote Backup Systems Failure Classification Transaction failure

More information

Google File System. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google fall DIP Heerak lim, Donghun Koo

Google File System. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google fall DIP Heerak lim, Donghun Koo Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google 2017 fall DIP Heerak lim, Donghun Koo 1 Agenda Introduction Design overview Systems interactions Master operation Fault tolerance

More information

Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed McGraw-Hill by

Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed McGraw-Hill by Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan. Original slides are available

More information

CS3600 SYSTEMS AND NETWORKS

CS3600 SYSTEMS AND NETWORKS CS3600 SYSTEMS AND NETWORKS NORTHEASTERN UNIVERSITY Lecture 11: File System Implementation Prof. Alan Mislove (amislove@ccs.neu.edu) File-System Structure File structure Logical storage unit Collection

More information

RAMCloud and the Low- Latency Datacenter. John Ousterhout Stanford University

RAMCloud and the Low- Latency Datacenter. John Ousterhout Stanford University RAMCloud and the Low- Latency Datacenter John Ousterhout Stanford University Most important driver for innovation in computer systems: Rise of the datacenter Phase 1: large scale Phase 2: low latency Introduction

More information

Authenticated Storage Using Small Trusted Hardware Hsin-Jung Yang, Victor Costan, Nickolai Zeldovich, and Srini Devadas

Authenticated Storage Using Small Trusted Hardware Hsin-Jung Yang, Victor Costan, Nickolai Zeldovich, and Srini Devadas Authenticated Storage Using Small Trusted Hardware Hsin-Jung Yang, Victor Costan, Nickolai Zeldovich, and Srini Devadas Massachusetts Institute of Technology November 8th, CCSW 2013 Cloud Storage Model

More information

Advanced Memory Management

Advanced Memory Management Advanced Memory Management Main Points Applications of memory management What can we do with ability to trap on memory references to individual pages? File systems and persistent storage Goals Abstractions

More information

Hello, and welcome to this online, self-paced course module covering Exadata Smart Flash Log. My name is Peter Fusek. I am a curriculum developer at

Hello, and welcome to this online, self-paced course module covering Exadata Smart Flash Log. My name is Peter Fusek. I am a curriculum developer at Hello, and welcome to this online, self-paced course module covering Exadata Smart Flash Log. My name is Peter Fusek. I am a curriculum developer at Oracle, and in various roles I have helped customers

More information

Database Technology. Topic 11: Database Recovery

Database Technology. Topic 11: Database Recovery Topic 11: Database Recovery Olaf Hartig olaf.hartig@liu.se Types of Failures Database may become unavailable for use due to: Transaction failures e.g., incorrect input, deadlock, incorrect synchronization

More information

File System Management

File System Management Lecture 8: Storage Management File System Management Contents Non volatile memory Tape, HDD, SSD Files & File System Interface Directories & their Organization File System Implementation Disk Space Allocation

More information

Outline. Purpose of this paper. Purpose of this paper. Transaction Review. Outline. Aries: A Transaction Recovery Method

Outline. Purpose of this paper. Purpose of this paper. Transaction Review. Outline. Aries: A Transaction Recovery Method Outline Aries: A Transaction Recovery Method Presented by Haoran Song Discussion by Hoyt Purpose of this paper Computer system is crashed as easily as other devices. Disk burned Software Errors Fires or

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung December 2003 ACM symposium on Operating systems principles Publisher: ACM Nov. 26, 2008 OUTLINE INTRODUCTION DESIGN OVERVIEW

More information

Ecient Redo Processing in. Jun-Lin Lin. Xi Li. Southern Methodist University

Ecient Redo Processing in. Jun-Lin Lin. Xi Li. Southern Methodist University Technical Report 96-CSE-13 Ecient Redo Processing in Main Memory Databases by Jun-Lin Lin Margaret H. Dunham Xi Li Department of Computer Science and Engineering Southern Methodist University Dallas, Texas

More information

Qsync. Cross-device File Sync for Optimal Teamwork. Share your life and work

Qsync. Cross-device File Sync for Optimal Teamwork. Share your life and work Qsync Cross-device File Sync for Optimal Teamwork Share your life and work Agenda Users' common issues QNAP NAS specifications recommended by various types of users Usage scenarios and Qsync application

More information

Chapter 6. Storage and Other I/O Topics

Chapter 6. Storage and Other I/O Topics Chapter 6 Storage and Other I/O Topics Introduction I/O devices can be characterized by Behaviour: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections

More information

Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching

Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching Bogdan Nicolae (IBM Research, Ireland) Pierre Riteau (University of Chicago, USA) Kate Keahey (Argonne National

More information

SMORE: A Cold Data Object Store for SMR Drives

SMORE: A Cold Data Object Store for SMR Drives SMORE: A Cold Data Object Store for SMR Drives Peter Macko, Xiongzi Ge, John Haskins Jr.*, James Kelley, David Slik, Keith A. Smith, and Maxim G. Smith Advanced Technology Group NetApp, Inc. * Qualcomm

More information

Indexing in RAMCloud. Ankita Kejriwal, Ashish Gupta, Arjun Gopalan, John Ousterhout. Stanford University

Indexing in RAMCloud. Ankita Kejriwal, Ashish Gupta, Arjun Gopalan, John Ousterhout. Stanford University Indexing in RAMCloud Ankita Kejriwal, Ashish Gupta, Arjun Gopalan, John Ousterhout Stanford University RAMCloud 1.0 Introduction Higher-level data models Without sacrificing latency and scalability Secondary

More information

PebblesDB: Building Key-Value Stores using Fragmented Log Structured Merge Trees

PebblesDB: Building Key-Value Stores using Fragmented Log Structured Merge Trees PebblesDB: Building Key-Value Stores using Fragmented Log Structured Merge Trees Pandian Raju 1, Rohan Kadekodi 1, Vijay Chidambaram 1,2, Ittai Abraham 2 1 The University of Texas at Austin 2 VMware Research

More information

White Paper. File System Throughput Performance on RedHawk Linux

White Paper. File System Throughput Performance on RedHawk Linux White Paper File System Throughput Performance on RedHawk Linux By: Nikhil Nanal Concurrent Computer Corporation August Introduction This paper reports the throughput performance of the,, and file systems

More information

Exploring Use-cases for Non-Volatile Memories in support of HPC Resilience

Exploring Use-cases for Non-Volatile Memories in support of HPC Resilience Exploring Use-cases for Non-Volatile Memories in support of HPC Resilience Onkar Patil 1, Saurabh Hukerikar 2, Frank Mueller 1, Christian Engelmann 2 1 Dept. of Computer Science, North Carolina State University

More information

SICV Snapshot Isolation with Co-Located Versions

SICV Snapshot Isolation with Co-Located Versions SICV Snapshot Isolation with Co-Located Versions Robert Gottstein, Ilia Petrov, Alejandro Buchmann {lastname}@dvs.tu-darmstadt.de Databases and Distributed Systems Robert Gottstein, Ilia Petrov, Alejandro

More information

Accelerating Microsoft SQL Server Performance With NVDIMM-N on Dell EMC PowerEdge R740

Accelerating Microsoft SQL Server Performance With NVDIMM-N on Dell EMC PowerEdge R740 Accelerating Microsoft SQL Server Performance With NVDIMM-N on Dell EMC PowerEdge R740 A performance study with NVDIMM-N Dell EMC Engineering September 2017 A Dell EMC document category Revisions Date

More information

Recovering Disk Storage Metrics from low level Trace events

Recovering Disk Storage Metrics from low level Trace events Recovering Disk Storage Metrics from low level Trace events Progress Report Meeting May 05, 2016 Houssem Daoud Michel Dagenais École Polytechnique de Montréal Laboratoire DORSAL Agenda Introduction and

More information

SGL: A Scalable Language for Data-Driven Games

SGL: A Scalable Language for Data-Driven Games SGL: A Scalable Language for Data-Driven Games [Demonstration Paper] Robert Albright, Alan Demers, Johannes Gehrke, Nitin Gupta, Hooyeon Lee, Rick Keilty, Gregory Sadowski, Ben Sowell, Walker White Cornell

More information

Chapter 16: Recovery System. Chapter 16: Recovery System

Chapter 16: Recovery System. Chapter 16: Recovery System Chapter 16: Recovery System Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 16: Recovery System Failure Classification Storage Structure Recovery and Atomicity Log-Based

More information

Bottleneck Hunters: How Schooner increased MySQL throughput by more than 800% Jeremy Cole

Bottleneck Hunters: How Schooner increased MySQL throughput by more than 800% Jeremy Cole Bottleneck Hunters: How Schooner increased MySQL throughput by more than 800% Jeremy Cole On the genesis of Schooner: Hardware is massively under-utilized I/O has long

More information

Dell Reference Configuration for Large Oracle Database Deployments on Dell EqualLogic Storage

Dell Reference Configuration for Large Oracle Database Deployments on Dell EqualLogic Storage Dell Reference Configuration for Large Oracle Database Deployments on Dell EqualLogic Storage Database Solutions Engineering By Raghunatha M, Ravi Ramappa Dell Product Group October 2009 Executive Summary

More information

Lazy Persistency: a High-Performing and Write-Efficient Software Persistency Technique

Lazy Persistency: a High-Performing and Write-Efficient Software Persistency Technique Lazy Persistency: a High-Performing and Write-Efficient Software Persistency Technique Mohammad Alshboul, James Tuck, and Yan Solihin Email: maalshbo@ncsu.edu ARPERS Research Group Introduction Future

More information

EXPLODE: a Lightweight, General System for Finding Serious Storage System Errors. Junfeng Yang, Can Sar, Dawson Engler Stanford University

EXPLODE: a Lightweight, General System for Finding Serious Storage System Errors. Junfeng Yang, Can Sar, Dawson Engler Stanford University EXPLODE: a Lightweight, General System for Finding Serious Storage System Errors Junfeng Yang, Can Sar, Dawson Engler Stanford University Why check storage systems? Storage system errors are among the

More information

Outline. Failure Types

Outline. Failure Types Outline Database Tuning Nikolaus Augsten University of Salzburg Department of Computer Science Database Group 1 Unit 10 WS 2013/2014 Adapted from Database Tuning by Dennis Shasha and Philippe Bonnet. Nikolaus

More information

BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory

BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory JOY ARULRAJ JUSTIN LEVANDOSKI UMAR FAROOQ MINHAS PER-AKE LARSON Microsoft Research NON-VOLATILE MEMORY [NVM] PERFORMANCE DRAM VOLATILE

More information

The Google File System

The Google File System October 13, 2010 Based on: S. Ghemawat, H. Gobioff, and S.-T. Leung: The Google file system, in Proceedings ACM SOSP 2003, Lake George, NY, USA, October 2003. 1 Assumptions Interface Architecture Single

More information

Computer Systems Laboratory Sungkyunkwan University

Computer Systems Laboratory Sungkyunkwan University I/O System Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Introduction (1) I/O devices can be characterized by Behavior: input, output, storage

More information

Basics of SQL Transactions

Basics of SQL Transactions www.dbtechnet.org Basics of SQL Transactions Big Picture for understanding COMMIT and ROLLBACK of SQL transactions Files, Buffers,, Service Threads, and Transactions (Flat) SQL Transaction [BEGIN TRANSACTION]

More information

CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed. CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed. File-System Structure File structure Logical storage unit Collection of related information File

More information

Towards Real-Time, Many Task Applications on Large Distributed Systems

Towards Real-Time, Many Task Applications on Large Distributed Systems Towards Real-Time, Many Task Applications on Large Distributed Systems - focusing on the implementation of RT-BOINC Sangho Yi (sangho.yi@inria.fr) Content Motivation and Background RT-BOINC in a nutshell

More information

E-Store: Fine-Grained Elastic Partitioning for Distributed Transaction Processing Systems

E-Store: Fine-Grained Elastic Partitioning for Distributed Transaction Processing Systems E-Store: Fine-Grained Elastic Partitioning for Distributed Transaction Processing Systems Rebecca Taft, Essam Mansour, Marco Serafini, Jennie Duggan, Aaron J. Elmore, Ashraf Aboulnaga, Andrew Pavlo, Michael

More information

ASN Configuration Best Practices

ASN Configuration Best Practices ASN Configuration Best Practices Managed machine Generally used CPUs and RAM amounts are enough for the managed machine: CPU still allows us to read and write data faster than real IO subsystem allows.

More information

Engineering Fault-Tolerant TCP/IP servers using FT-TCP. Dmitrii Zagorodnov University of California San Diego

Engineering Fault-Tolerant TCP/IP servers using FT-TCP. Dmitrii Zagorodnov University of California San Diego Engineering Fault-Tolerant TCP/IP servers using FT-TCP Dmitrii Zagorodnov University of California San Diego Motivation Reliable network services are desirable but costly! Extra and/or specialized hardware

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part XIII Lecture 21, April 14, 2014 Mohammad Hammoud Today Last Session: Transaction Management (Cont d) Today s Session: Transaction Management (finish)

More information

Benchmark: In-Memory Database System (IMDS) Deployed on NVDIMM

Benchmark: In-Memory Database System (IMDS) Deployed on NVDIMM Benchmark: In-Memory Database System (IMDS) Deployed on NVDIMM Presented by Steve Graves, McObject and Jeff Chang, AgigA Tech Santa Clara, CA 1 The Problem: Memory Latency NON-VOLATILE MEMORY HIERARCHY

More information

Presented by: Nafiseh Mahmoudi Spring 2017

Presented by: Nafiseh Mahmoudi Spring 2017 Presented by: Nafiseh Mahmoudi Spring 2017 Authors: Publication: Type: ACM Transactions on Storage (TOS), 2016 Research Paper 2 High speed data processing demands high storage I/O performance. Flash memory

More information

Two hours - online. The exam will be taken on line. This paper version is made available as a backup

Two hours - online. The exam will be taken on line. This paper version is made available as a backup COMP 25212 Two hours - online The exam will be taken on line. This paper version is made available as a backup UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE System Architecture Date: Monday 21st

More information

Weak Levels of Consistency

Weak Levels of Consistency Weak Levels of Consistency - Some applications are willing to live with weak levels of consistency, allowing schedules that are not serialisable E.g. a read-only transaction that wants to get an approximate

More information

Database Recovery. Haengrae Cho Yeungnam University. Database recovery. Introduction to Database Systems

Database Recovery. Haengrae Cho Yeungnam University. Database recovery. Introduction to Database Systems Database Recovery Haengrae Cho Yeungnam University Database recovery. Introduction to Database Systems Report Yeungnam University, Database Lab. Chapter 1-1 1. Introduction to Database Recovery 2. Recovery

More information

The Checkpoint Mechanism in KeyKOS

The Checkpoint Mechanism in KeyKOS The Checkpoint Mechanism in KeyKOS Charles R. Landau MACS Lab, Inc., 2454 Brannan Place, Santa Clara, CA 95050 KeyKOS is an object-oriented microkernel operating system. KeyKOS achieves persistence of

More information

Georgia Institute of Technology ECE6102 4/20/2009 David Colvin, Jimmy Vuong

Georgia Institute of Technology ECE6102 4/20/2009 David Colvin, Jimmy Vuong Georgia Institute of Technology ECE6102 4/20/2009 David Colvin, Jimmy Vuong Relatively recent; still applicable today GFS: Google s storage platform for the generation and processing of data used by services

More information

NVthreads: Practical Persistence for Multi-threaded Applications

NVthreads: Practical Persistence for Multi-threaded Applications NVthreads: Practical Persistence for Multi-threaded Applications Terry Hsu*, Purdue University Helge Brügner*, TU München Indrajit Roy*, Google Inc. Kimberly Keeton, Hewlett Packard Labs Patrick Eugster,

More information

MEMBRANE: OPERATING SYSTEM SUPPORT FOR RESTARTABLE FILE SYSTEMS

MEMBRANE: OPERATING SYSTEM SUPPORT FOR RESTARTABLE FILE SYSTEMS Department of Computer Science Institute of System Architecture, Operating Systems Group MEMBRANE: OPERATING SYSTEM SUPPORT FOR RESTARTABLE FILE SYSTEMS SWAMINATHAN SUNDARARAMAN, SRIRAM SUBRAMANIAN, ABHISHEK

More information

Workload Performance Comparison. Eden Kim, Calypso Systems, Inc.

Workload Performance Comparison. Eden Kim, Calypso Systems, Inc. Workload Performance Comparison Eden Kim, Calypso Systems, Inc. Agenda Part 1: Workload Comparisons: Part 2: NVMe SSD, & Mmap Part 3: DAX: Conclusions 2 Part 1: Workload Comparisons Synthetic Benchmark:

More information

An Empirical Study of High Availability in Stream Processing Systems

An Empirical Study of High Availability in Stream Processing Systems An Empirical Study of High Availability in Stream Processing Systems Yu Gu, Zhe Zhang, Fan Ye, Hao Yang, Minkyong Kim, Hui Lei, Zhen Liu Stream Processing Model software operators (PEs) Ω Unexpected machine

More information

ARIES (& Logging) April 2-4, 2018

ARIES (& Logging) April 2-4, 2018 ARIES (& Logging) April 2-4, 2018 1 What does it mean for a transaction to be committed? 2 If commit returns successfully, the transaction is recorded completely (atomicity) left the database in a stable

More information

Adaptec MaxIQ SSD Cache Performance Solution for Web Server Environments Analysis

Adaptec MaxIQ SSD Cache Performance Solution for Web Server Environments Analysis Adaptec MaxIQ SSD Cache Performance Solution for Web Server Environments Analysis September 22, 2009 Page 1 of 7 Introduction Adaptec has requested an evaluation of the performance of the Adaptec MaxIQ

More information

Crash Recovery CMPSCI 645. Gerome Miklau. Slide content adapted from Ramakrishnan & Gehrke

Crash Recovery CMPSCI 645. Gerome Miklau. Slide content adapted from Ramakrishnan & Gehrke Crash Recovery CMPSCI 645 Gerome Miklau Slide content adapted from Ramakrishnan & Gehrke 1 Review: the ACID Properties Database systems ensure the ACID properties: Atomicity: all operations of transaction

More information

TFS: A Transparent File System for Contributory Storage

TFS: A Transparent File System for Contributory Storage TFS: A Transparent File System for Contributory Storage James Cipar, Mark Corner, Emery Berger http://prisms.cs.umass.edu/tcsm University of Massachusetts, Amherst Contributory Applications Users contribute

More information

REMEM: REmote MEMory as Checkpointing Storage

REMEM: REmote MEMory as Checkpointing Storage REMEM: REmote MEMory as Checkpointing Storage Hui Jin Illinois Institute of Technology Xian-He Sun Illinois Institute of Technology Yong Chen Oak Ridge National Laboratory Tao Ke Illinois Institute of

More information

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Kefei Wang and Feng Chen Louisiana State University SoCC '18 Carlsbad, CA Key-value Systems in Internet Services Key-value

More information

Level 2 Diploma Unit 3 Computer Systems

Level 2 Diploma Unit 3 Computer Systems Level 2 Diploma Unit 3 Computer Systems You are an IT technician in a small company which creates web sites. The company has recently employed someone who is partially sighted and is also left handed.

More information

Experimental Study of Virtual Machine Migration in Support of Reservation of Cluster Resources

Experimental Study of Virtual Machine Migration in Support of Reservation of Cluster Resources Experimental Study of Virtual Machine Migration in Support of Reservation of Cluster Resources Ming Zhao, Renato J. Figueiredo Advanced Computing and Information Systems (ACIS) Electrical and Computer

More information

Chapter 17: Recovery System

Chapter 17: Recovery System Chapter 17: Recovery System Database System Concepts See www.db-book.com for conditions on re-use Chapter 17: Recovery System Failure Classification Storage Structure Recovery and Atomicity Log-Based Recovery

More information

Chapter 9. Recovery. Database Systems p. 368/557

Chapter 9. Recovery. Database Systems p. 368/557 Chapter 9 Recovery Database Systems p. 368/557 Recovery An important task of a DBMS is to avoid losing data caused by a system crash The recovery system of a DBMS utilizes two important mechanisms: Backups

More information

DBMS Data Loading: An Analysis on Modern Hardware. Adam Dziedzic, Manos Karpathiotakis*, Ioannis Alagiannis, Raja Appuswamy, Anastasia Ailamaki

DBMS Data Loading: An Analysis on Modern Hardware. Adam Dziedzic, Manos Karpathiotakis*, Ioannis Alagiannis, Raja Appuswamy, Anastasia Ailamaki DBMS Data Loading: An Analysis on Modern Hardware Adam Dziedzic, Manos Karpathiotakis*, Ioannis Alagiannis, Raja Appuswamy, Anastasia Ailamaki Data loading: A necessary evil Volume => Expensive 4 zettabytes

More information

ijournaling: Fine-Grained Journaling for Improving the Latency of Fsync System Call

ijournaling: Fine-Grained Journaling for Improving the Latency of Fsync System Call ijournaling: Fine-Grained Journaling for Improving the Latency of Fsync System Call Daejun Park and Dongkun Shin, Sungkyunkwan University, Korea https://www.usenix.org/conference/atc17/technical-sessions/presentation/park

More information

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme FUT3040BU Storage at Memory Speed: Finally, Nonvolatile Memory Is Here Rajesh Venkatasubramanian, VMware, Inc Richard A Brunner, VMware, Inc #VMworld #FUT3040BU Disclaimer This presentation may contain

More information

[14] C. Mohan, D. Haderle, B. Lindsay, H. Pirahesh, and P. Schwarz. ARIES: A transaction recovery

[14] C. Mohan, D. Haderle, B. Lindsay, H. Pirahesh, and P. Schwarz. ARIES: A transaction recovery [14] C. Mohan, D. Haderle, B. Lindsay, H. Pirahesh, and P. Schwarz. ARIES: A transaction recovery method supporting ne-granularity locking and partial rollbacks using write-ahead logging. ACM Transactions

More information

Transactifying Apache s Cache Module

Transactifying Apache s Cache Module H. Eran O. Lutzky Z. Guz I. Keidar Department of Electrical Engineering Technion Israel Institute of Technology SYSTOR 2009 The Israeli Experimental Systems Conference Outline 1 Why legacy applications

More information

The Dangers and Complexities of SQLite Benchmarking. Dhathri Purohith, Jayashree Mohan and Vijay Chidambaram

The Dangers and Complexities of SQLite Benchmarking. Dhathri Purohith, Jayashree Mohan and Vijay Chidambaram The Dangers and Complexities of SQLite Benchmarking Dhathri Purohith, Jayashree Mohan and Vijay Chidambaram 2 3 Benchmarking SQLite is Non-trivial! Benchmarking complex systems in a repeatable fashion

More information

Deep Learning Performance and Cost Evaluation

Deep Learning Performance and Cost Evaluation Micron 5210 ION Quad-Level Cell (QLC) SSDs vs 7200 RPM HDDs in Centralized NAS Storage Repositories A Technical White Paper Don Wang, Rene Meyer, Ph.D. info@ AMAX Corporation Publish date: October 25,

More information

Transaction Management Overview. Transactions. Concurrency in a DBMS. Chapter 16

Transaction Management Overview. Transactions. Concurrency in a DBMS. Chapter 16 Transaction Management Overview Chapter 16 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Transactions Concurrent execution of user programs is essential for good DBMS performance. Because

More information

RIGHTNOW A C E

RIGHTNOW A C E RIGHTNOW A C E 2 0 1 4 2014 Aras 1 A C E 2 0 1 4 Scalability Test Projects Understanding the results 2014 Aras Overview Original Use Case Scalability vs Performance Scale to? Scaling the Database Server

More information

A Better Storage Solution

A Better Storage Solution A Better Storage Solution Presented by: Richard Goss Presentation to The Problem with Hard Disks Processors have increased in speed by orders of magnitude over the years. But spinning hard disk drives

More information

In This Lecture. Transactions and Recovery. Transactions. Transactions. Isolation and Durability. Atomicity and Consistency. Transactions Recovery

In This Lecture. Transactions and Recovery. Transactions. Transactions. Isolation and Durability. Atomicity and Consistency. Transactions Recovery In This Lecture Database Systems Lecture 15 Natasha Alechina Transactions Recovery System and Media s Concurrency Concurrency problems For more information Connolly and Begg chapter 20 Ullmanand Widom8.6

More information

Progress Report on Transparent Checkpointing for Supercomputing

Progress Report on Transparent Checkpointing for Supercomputing Progress Report on Transparent Checkpointing for Supercomputing Jiajun Cao, Rohan Garg College of Computer and Information Science, Northeastern University {jiajun,rohgarg}@ccs.neu.edu August 21, 2015

More information

EMC Backup and Recovery for Microsoft SQL Server

EMC Backup and Recovery for Microsoft SQL Server EMC Backup and Recovery for Microsoft SQL Server Enabled by Microsoft SQL Native Backup Reference Copyright 2010 EMC Corporation. All rights reserved. Published February, 2010 EMC believes the information

More information

Computer Architecture Computer Science & Engineering. Chapter 6. Storage and Other I/O Topics BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 6. Storage and Other I/O Topics BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 6 Storage and Other I/O Topics Introduction I/O devices can be characterized by Behaviour: input, output, storage Partner: human or machine

More information

System Requirements. PREEvision. System requirements and deployment scenarios Version 7.0 English

System Requirements. PREEvision. System requirements and deployment scenarios Version 7.0 English System Requirements PREEvision System and deployment scenarios Version 7.0 English Imprint Vector Informatik GmbH Ingersheimer Straße 24 70499 Stuttgart, Germany Vector reserves the right to modify any

More information

Oracle Database 12c: JMS Sharded Queues

Oracle Database 12c: JMS Sharded Queues Oracle Database 12c: JMS Sharded Queues For high performance, scalable Advanced Queuing ORACLE WHITE PAPER MARCH 2015 Table of Contents Introduction 2 Architecture 3 PERFORMANCE OF AQ-JMS QUEUES 4 PERFORMANCE

More information

Optimizing Server Flash with Intelligent Caching (Enterprise Storage Track)

Optimizing Server Flash with Intelligent Caching (Enterprise Storage Track) Optimizing Server Flash with Intelligent Caching (Enterprise Storage Track) Mohit Bhatnagar Sr. Director Flash Products and Solutions Santa Clara, CA 1 Definition of Intelligent Source: Dictionary.com

More information

Strata: A Cross Media File System. Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson

Strata: A Cross Media File System. Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson A Cross Media File System Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson 1 Let s build a fast server NoSQL store, Database, File server, Mail server Requirements

More information

The Google File System (GFS)

The Google File System (GFS) 1 The Google File System (GFS) CS60002: Distributed Systems Antonio Bruto da Costa Ph.D. Student, Formal Methods Lab, Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur 2 Design constraints

More information