Knockoff: Cheap versions in the cloud. Xianzheng Dou, Peter M. Chen, Jason Flinn

Size: px
Start display at page:

Download "Knockoff: Cheap versions in the cloud. Xianzheng Dou, Peter M. Chen, Jason Flinn"

Transcription

1 Knockoff: Cheap versions in the cloud Xianzheng Dou, Peter M. Chen, Jason Flinn

2 Cloud-based storage Google Drive Dropbox Pros: Ease-of-management Reliability Microsoft OneDrive Xianzheng Dou 1

3 Cloud-based storage Google Drive Dropbox Challenges: Storage costs Communication costs Microsoft OneDrive Xianzheng Dou 2

4 Versioning increases costs Google Drive Dropbox Pros: Recovery of lost data Auditing Troubleshooting Versioning Microsoft OneDrive Xianzheng Dou 3

5 Reducing costs: a new direction Established methods exploit similarities in data Chunk-based deduplication Delta compression Greater work for incremental gains Our goal: explore an orthogonal new dimension Deterministically recompute data in lieu of communication, storage Xianzheng Dou 4

6 File: data or computation? Computation File data Xianzheng Dou 5

7 File: data or computation? Computation File data Xianzheng Dou 5

8 File: data or computation? Computation File data Xianzheng Dou 6

9 File: data or computation? Computation File data Xianzheng Dou 6

10 File: data or computation? Computation File data Xianzheng Dou 6

11 File: data or computation? Computation File data Xianzheng Dou 6

12 File: data or computation? Computation File data How can we address non-determinism? Different output data Xianzheng Dou 6

13 File: data or computation? Deterministic record and replay How to use X-ray RECORD Record Log PLAY Latency Metric Mail Scope Request Logs of nondeterminism debug_peer_list List of root causes: Xianzheng mydomain.com 1) Dou 7 queue_directory 2)

14 File: data or computation? Deterministic record and replay How to use X-ray RECORD Record Log PLAY Latency Metric Mail Scope Request Logs of nondeterminism debug_peer_list List of root causes: Xianzheng mydomain.com 1) Dou 7 queue_directory 2)

15 File: data or computation? Deterministic record and replay How to use X-ray RECORD Record Log PLAY Latency Metric Mail Scope Request Logs of nondeterminism debug_peer_list List of root causes: Xianzheng mydomain.com 1) Dou 7 queue_directory 2)

16 File: data or computation? Deterministic record and replay RECORD How to use X-ray How to use X-ray Record Log RECORD PLAY Log PLAY Replay Latency Metric Mail Scope Request Logs of nondeterminism Latency Metric Mail Scope Request debug_peer_list List of root causes: Xianzheng mydomain.com Dou debug_peer_list List of root causes: 1) 7 queue_directory 2) mydomain.com 1)

17 Knockoff Selectively substitutes computation for data Benefits Reduction compared to chunk-based deduplication Communication costs: 21% Storage costs: 19% Benefits increases as we retain versions more frequently A new fined-grained versioning policy Xianzheng Dou 8

18 Outline Introduction Writing files Storing files Evaluation Xianzheng Dou 9

19 Knockoff Knockoff selectively represents a file as: Normal file data (by value) Logs of the nondeterminism needed to recompute the file (by operation) File Xianzheng Dou 10

20 Knockoff Knockoff selectively represents a file as: Normal file data (by value) Logs of the nondeterminism needed to recompute the file (by operation) File Xianzheng Dou 10

21 Knockoff Knockoff selectively represents a file as: Normal file data (by value) Logs of the nondeterminism needed to recompute the file (by operation) OR File Xianzheng Dou 10

22 Knockoff Knockoff selectively represents a file as: Normal file data (by value) Logs of the nondeterminism needed to recompute the file (by operation) OR File OR Xianzheng Dou 10

23 An example log for compilation Log entry Values 1 open rc=3 2 mmap rc=<addr>,file=<id,version> 3 gettimeofday rc=0,time=<time> 4 pthread_lock rc=0 5 SIGCHILD Xianzheng Dou 11

24 An example log for compilation Log entry Values 1 open rc=3 2 mmap rc=<addr>,file=<id,version> Return values from syscalls 3 gettimeofday rc=0,time=<time> 4 pthread_lock rc=0 Ordering of thread synchronization 5 SIGCHILD Signals Xianzheng Dou 11

25 An example log for compilation Log entry Values 1 open rc=3 2 mmap rc=<addr>,file=<id,version> 3 gettimeofday rc=0,time=<time> 4 pthread_lock rc=0 5 SIGCHILD Xianzheng Dou 11

26 An example log for compilation Log entry Values 1 open rc=3 2 mmap rc=<addr>,file=<id,version> 3 gettimeofday rc=0,time=<time> 4 pthread_lock rc=0 5 SIGCHILD Xianzheng Dou 11

27 Writing files By value By operation Xianzheng Dou 13

28 Writing files By value By operation Xianzheng Dou 13

29 Writing files By value By operation Xianzheng Dou 13

30 Writing files By value photo editing By operation Xianzheng Dou 14

31 Writing files By value cryptographic key generation By operation Xianzheng Dou 15

32 Outline Introduction Writing files Storing files Evaluation Xianzheng Dou 17

33 Storing files Store files by value or by operation?? A tradeoff between latency and costs Current versions: by value Past versions: by value or by operation Xianzheng Dou 18

34 Storing past versions Maximum materialization delay Time bound for reconstructing any version Regeneration time = 20s < Materialization delay = 60s Xianzheng Dou 19

35 Storing past versions Maximum materialization delay Time bound for reconstructing any version Regeneration time = 20s < Materialization delay = 60s Xianzheng Dou 19

36 Storing past versions Maximum materialization delay Time bound for reconstructing any version Regeneration time = 100s > Materialization delay = 60s Xianzheng Dou 20

37 Storing past versions Maximum materialization delay Time bound for reconstructing any version Regeneration time = 100s > Materialization delay = 60s Xianzheng Dou 20

38 Storing past versions Maximum materialization delay Time bound for reconstructing any version Longest path > materialization delay 20s Total regeneration time = 20s < Materialization delay = 60s Xianzheng Dou 21

39 Storing past versions Maximum materialization delay Time bound for reconstructing any version Longest path > materialization delay 30s 20s Total regeneration time = 50s < Materialization delay = 60s Xianzheng Dou 22

40 Storing past versions 30s Maximum materialization delay Time bound for reconstructing any version Longest path > materialization delay 30s 20s Total regeneration time = 80s > Materialization delay = 60s Xianzheng Dou 23

41 Storing past versions 30s Maximum materialization delay Time bound for reconstructing any version Longest path > materialization delay 30s 20s Total regeneration time = 80s > Materialization delay = 60s Xianzheng Dou 24

42 Storing past versions 30s Maximum materialization delay Time bound for reconstructing any version Longest path > materialization delay 30s 20s Total regeneration time = 80s > Materialization delay = 60s Xianzheng Dou 24

43 Storing past versions Maximum materialization delay Time bound for reconstructing any version Longest path > materialization delay A greedy algorithm Materialization delay = 60s Xianzheng Dou 25

44 Storing past versions: versioning policies Frequency of versioning Xianzheng Dou 26

45 Storing past versions: versioning policies Frequency of versioning No versioning Version on close Version on write Eidetic versioning Xianzheng Dou 26

46 Storing past versions: versioning policies Frequency of versioning No versioning Version on close Version on write Eidetic versioning Memory-mapped files Any past transient state in memory? Xianzheng Dou 26

47 Optimization: log compression Chunk-based deduplication is effective for file data Executions of the same application have similar patterns Can it also be applied to computation (logs of nondeterminism)? Delta compression Xianzheng Dou 28

48 Optimization: log compression Problem: a smattering of values differ in each log Xianzheng Dou 29

49 Optimization: log compression Problem: a smattering of values differ in each log Delta compression: 42% reduction Xianzheng Dou 29

50 Outline Introduction Writing files Storing files Evaluation Xianzheng Dou 30

51 Evaluation How much does Knockoff reduce bandwidth usage? How much does Knockoff reduce storage costs? What is Knockoff s performance overhead? For more experimental results, please refer to our paper Xianzheng Dou 31

52 Experimental setup User study 8 participants performed several simple tasks in one hour 20-day study A single-user longitudinal study A variety of programs used Various Linux utilities, text editors and programming languages Xianzheng Dou 32

53 Bandwidth usage: user study Xianzheng Dou 33

54 Data sent to the server (MB) Bandwidth usage: user study Already achieve 80%-85% reduction No versioning Version on close Version on write Eidetic Chunk-based deduplication Knockoff Xianzheng Dou 33

55 Data sent to the server (MB) Bandwidth usage: user study Already achieve 80%-85% reduction No versioning Version on close Version on write Eidetic Chunk-based deduplication Knockoff Xianzheng Dou 33

56 Data sent to the server (MB) Bandwidth usage: user study % No versioning Version on close Version on write Eidetic Chunk-based deduplication Knockoff Xianzheng Dou 33

57 Data sent to the server (MB) Bandwidth usage: user study % 25% 47% No versioning Version on close Version on write Eidetic Chunk-based deduplication Knockoff Xianzheng Dou 33

58 Data sent to the server (MB) Bandwidth usage: user study % 25% 47% No versioning Version on close Version on write Eidetic Chunk-based deduplication Knockoff Xianzheng Dou 33

59 Bandwidth savings(%) Variances across applications Linux utility Graph editing Libreoffice Programming Text editing Web browsing Total Version on close Xianzheng Dou 35

60 Bandwidth savings(%) Variances across applications Linux utility Graph editing Libreoffice Programming Text editing Web browsing Total Version on close Xianzheng Dou 35

61 Bandwidth savings(%) Variances across users Knockoff A B C D E F G Version on close Xianzheng Dou 36

62 Bandwidth savings(%) Variances across users Knockoff A B C D E F G Version on close Xianzheng Dou 36

63 Relative storage cost Relative storage costs for past versions Version on close Version on write Eidetic Chunk-based deduplication Knockoff Xianzheng Dou 37

64 Relative storage cost Relative storage costs for past versions % 1 19% Version on close Version on write Eidetic Chunk-based deduplication Knockoff Xianzheng Dou 37

65 Relative storage cost Relative storage costs for past versions % 1 19% Version on close Version on write Eidetic Chunk-based deduplication Knockoff Xianzheng Dou 37

66 Performance overheads 7-8% performance overheads Baseline No Versioning Version on close Version on write Eidetic 38

67 Conclusion A new dimension for reducing costs Selectively substitute computation for data A general-purpose system for deterministic recomputation Reduces storage and communication costs for existing versioning policies Enables eidetic versioning Xianzheng Dou 39

68 Thank you! Xianzheng Dou 40

69 Varying the materialization delay Xianzheng Dou 41

70 Monetary costs Xianzheng Dou 42

71 Workload characteristics Xianzheng Dou 43

Toward Eidetic Distributed File Systems. Xianzheng Dou, Jason Flinn, Peter M. Chen

Toward Eidetic Distributed File Systems. Xianzheng Dou, Jason Flinn, Peter M. Chen Toward Eidetic Distributed File Systems Xianzheng Dou, Jason Flinn, Peter M. Chen Rich file system features Modern file systems store more than just data Versioning: retention of past state Provenance-aware:

More information

A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU

A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU PRESENTED BY ROMAN SHOR Overview Technics of data reduction in storage systems:

More information

Eidetic Systems. David Devecsery, Michael Chow, Xianzheng Dou, Jason Flinn, and Peter M. Chen, University of Michigan

Eidetic Systems. David Devecsery, Michael Chow, Xianzheng Dou, Jason Flinn, and Peter M. Chen, University of Michigan Eidetic Systems David Devecsery, Michael Chow, Xianzheng Dou, Jason Flinn, and Peter M. Chen, University of Michigan https://www.usenix.org/conference/osdi14/technical-sessions/presentation/devecsery This

More information

Eidetic Systems. David Devecsery, Michael Chow, Xianzheng Dou, Jason Flinn, and Peter M. Chen University of Michigan. Abstract.

Eidetic Systems. David Devecsery, Michael Chow, Xianzheng Dou, Jason Flinn, and Peter M. Chen University of Michigan. Abstract. Eidetic Systems David Devecsery, Michael Chow, Xianzheng Dou, Jason Flinn, and Peter M. Chen University of Michigan Abstract The vast majority of state produced by a typical computer is generated, consumed,

More information

Virtualization Technique For Replica Synchronization

Virtualization Technique For Replica Synchronization Virtualization Technique For Replica Synchronization By : Ashwin G.Sancheti Email:ashwin@cs.jhu.edu Instructor : Prof.Randal Burns Date : 19 th Feb 2008 Roadmap Motivation/Goals What is Virtualization?

More information

WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression

WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression Philip Shilane, Mark Huang, Grant Wallace, & Windsor Hsu Backup Recovery Systems Division EMC Corporation Introduction

More information

Two-Party Fine-Grained Assured Deletion of Outsourced Data in Cloud Systems

Two-Party Fine-Grained Assured Deletion of Outsourced Data in Cloud Systems Two-Party Fine-Grained Assured Deletion of Outsourced Data in Cloud Systems Authors: Zhen Mo, Yan Qiao, Shigang Chen Email: zmo, yqiao, sgchen@cise.ufl.edu Outline Background Security Concerns Previous

More information

EaSync: A Transparent File Synchronization Service across Multiple Machines

EaSync: A Transparent File Synchronization Service across Multiple Machines EaSync: A Transparent File Synchronization Service across Multiple Machines Huajian Mao 1,2, Hang Zhang 1,2, Xianqiang Bao 1,2, Nong Xiao 1,2, Weisong Shi 3, and Yutong Lu 1,2 1 State Key Laboratory of

More information

Accelerating Restore and Garbage Collection in Deduplication-based Backup Systems via Exploiting Historical Information

Accelerating Restore and Garbage Collection in Deduplication-based Backup Systems via Exploiting Historical Information Accelerating Restore and Garbage Collection in Deduplication-based Backup Systems via Exploiting Historical Information Min Fu, Dan Feng, Yu Hua, Xubin He, Zuoning Chen *, Wen Xia, Fangting Huang, Qing

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google* 정학수, 최주영 1 Outline Introduction Design Overview System Interactions Master Operation Fault Tolerance and Diagnosis Conclusions

More information

FGDEFRAG: A Fine-Grained Defragmentation Approach to Improve Restore Performance

FGDEFRAG: A Fine-Grained Defragmentation Approach to Improve Restore Performance FGDEFRAG: A Fine-Grained Defragmentation Approach to Improve Restore Performance Yujuan Tan, Jian Wen, Zhichao Yan, Hong Jiang, Witawas Srisa-an, Baiping Wang, Hao Luo Outline Background and Motivation

More information

Cumulus: Filesystem Backup to the Cloud

Cumulus: Filesystem Backup to the Cloud Cumulus: Filesystem Backup to the Cloud 7th USENIX Conference on File and Storage Technologies (FAST 09) Michael Vrable Stefan Savage Geoffrey M. Voelker University of California, San Diego February 26,

More information

Schema-Agnostic Indexing with Azure Document DB

Schema-Agnostic Indexing with Azure Document DB Schema-Agnostic Indexing with Azure Document DB Introduction Azure DocumentDB is Microsoft s multi-tenant distributed database service for managing JSON documents at Internet scale Multi-tenancy is an

More information

PYTHIA: Improving Datacenter Utilization via Precise Contention Prediction for Multiple Co-located Workloads

PYTHIA: Improving Datacenter Utilization via Precise Contention Prediction for Multiple Co-located Workloads PYTHIA: Improving Datacenter Utilization via Precise Contention Prediction for Multiple Co-located Workloads Ran Xu (Purdue), Subrata Mitra (Adobe Research), Jason Rahman (Facebook), Peter Bai (Purdue),

More information

Towards Web-based Delta Synchronization for Cloud Storage Services

Towards Web-based Delta Synchronization for Cloud Storage Services Towards Web-based Delta Synchronization for Cloud Storage Services He Xiao and Zhenhua Li, Tsinghua University; Ennan Zhai, Yale University; Tianyin Xu, UIUC; Yang Li and Yunhao Liu, Tsinghua University;

More information

Resource-Efficient Replication and Migration of Virtual Machines

Resource-Efficient Replication and Migration of Virtual Machines Resource-Efficient Replication and Migration of Virtual Machines by Kai-Yuan Hou A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer Science

More information

Making Serverless Computing More Serverless

Making Serverless Computing More Serverless Making Serverless Computing More Serverless Zaid Al-Ali, Sepideh Goodarzy, Ethan Hunter, Sangtae Ha, Richard Han, Eric Keller University of Colorado Boulder Eric Rozner IBM Research * Views not representative

More information

PageVault: Securing Off-Chip Memory Using Page-Based Authen?ca?on. Blaise-Pascal Tine Sudhakar Yalamanchili

PageVault: Securing Off-Chip Memory Using Page-Based Authen?ca?on. Blaise-Pascal Tine Sudhakar Yalamanchili PageVault: Securing Off-Chip Memory Using Page-Based Authen?ca?on Blaise-Pascal Tine Sudhakar Yalamanchili Outline Background: Memory Security Motivation Proposed Solution Implementation Evaluation Conclusion

More information

Giza: Erasure Coding Objects across Global Data Centers

Giza: Erasure Coding Objects across Global Data Centers Giza: Erasure Coding Objects across Global Data Centers Yu Lin Chen*, Shuai Mu, Jinyang Li, Cheng Huang *, Jin li *, Aaron Ogus *, and Douglas Phillips* New York University, *Microsoft Corporation USENIX

More information

Website Acceleration with mod_pagespeed

Website Acceleration with mod_pagespeed Website Acceleration with mod_pagespeed Joshua Marantz Google June 15, 2011 @jmarantz www.modpagespeed.com 2011 Google, Inc. All rights reserved. Velocity 2011: Faster By Default 2 Velocity 2011: Faster

More information

The Fusion Distributed File System

The Fusion Distributed File System Slide 1 / 44 The Fusion Distributed File System Dongfang Zhao February 2015 Slide 2 / 44 Outline Introduction FusionFS System Architecture Metadata Management Data Movement Implementation Details Unique

More information

Design Tradeoffs for Data Deduplication Performance in Backup Workloads

Design Tradeoffs for Data Deduplication Performance in Backup Workloads Design Tradeoffs for Data Deduplication Performance in Backup Workloads Min Fu,DanFeng,YuHua,XubinHe, Zuoning Chen *, Wen Xia,YuchengZhang,YujuanTan Huazhong University of Science and Technology Virginia

More information

Deterministic Process Groups in

Deterministic Process Groups in Deterministic Process Groups in Tom Bergan Nicholas Hunt, Luis Ceze, Steven D. Gribble University of Washington A Nondeterministic Program global x=0 Thread 1 Thread 2 t := x x := t + 1 t := x x := t +

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #12 2/21/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Last class Outline

More information

ViewBox. Integrating Local File Systems with Cloud Storage Services. Yupu Zhang +, Chris Dragga + *, Andrea Arpaci-Dusseau +, Remzi Arpaci-Dusseau +

ViewBox. Integrating Local File Systems with Cloud Storage Services. Yupu Zhang +, Chris Dragga + *, Andrea Arpaci-Dusseau +, Remzi Arpaci-Dusseau + ViewBox Integrating Local File Systems with Cloud Storage Services Yupu Zhang +, Chris Dragga + *, Andrea Arpaci-Dusseau +, Remzi Arpaci-Dusseau + + University of Wisconsin Madison *NetApp, Inc. 5/16/2014

More information

Parallelizing Cryptography. Gordon Werner Samantha Kenyon

Parallelizing Cryptography. Gordon Werner Samantha Kenyon Parallelizing Cryptography Gordon Werner Samantha Kenyon Outline Security requirements Cryptographic Primitives Block Cipher Parallelization of current Standards AES RSA Elliptic Curve Cryptographic Attacks

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff and Shun Tak Leung Google* Shivesh Kumar Sharma fl4164@wayne.edu Fall 2015 004395771 Overview Google file system is a scalable distributed file system

More information

Brief introduction of SocketPro continuous SQL-stream sending and processing system (Part 1: SQLite)

Brief introduction of SocketPro continuous SQL-stream sending and processing system (Part 1: SQLite) Brief introduction of SocketPro continuous SQL-stream sending and processing system (Part 1: SQLite) Introduction Most of client server database systems only support synchronous communication between client

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google SOSP 03, October 19 22, 2003, New York, USA Hyeon-Gyu Lee, and Yeong-Jae Woo Memory & Storage Architecture Lab. School

More information

Backup everything to cloud / local storage. CloudBacko Pro. Essential steps to get started

Backup everything to cloud / local storage. CloudBacko Pro. Essential steps to get started CloudBacko Pro Essential steps to get started Last update: September 22, 2017 Index Step 1). Configure a new backup set, and trigger a backup manually Step 2). Configure other backup set settings Step

More information

DASH COPY GUIDE. Published On: 11/19/2013 V10 Service Pack 4A Page 1 of 31

DASH COPY GUIDE. Published On: 11/19/2013 V10 Service Pack 4A Page 1 of 31 DASH COPY GUIDE Published On: 11/19/2013 V10 Service Pack 4A Page 1 of 31 DASH Copy Guide TABLE OF CONTENTS OVERVIEW GETTING STARTED ADVANCED BEST PRACTICES FAQ TROUBLESHOOTING DASH COPY PERFORMANCE TUNING

More information

Method-Level Phase Behavior in Java Workloads

Method-Level Phase Behavior in Java Workloads Method-Level Phase Behavior in Java Workloads Andy Georges, Dries Buytaert, Lieven Eeckhout and Koen De Bosschere Ghent University Presented by Bruno Dufour dufour@cs.rutgers.edu Rutgers University DCS

More information

Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance

Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance Outline Introduction and Motivation Software-centric Fault Detection Process-Level Redundancy Experimental Results

More information

Understanding Storage I/O Behaviors of Mobile Applications. Louisiana State University Department of Computer Science and Engineering

Understanding Storage I/O Behaviors of Mobile Applications. Louisiana State University Department of Computer Science and Engineering Understanding Storage I/O Behaviors of Mobile Applications Jace Courville jcourv@csc.lsu.edu Feng Chen fchen@csc.lsu.edu Louisiana State University Department of Computer Science and Engineering The Rise

More information

CGAR: Strong Consistency without Synchronous Replication. Seo Jin Park Advised by: John Ousterhout

CGAR: Strong Consistency without Synchronous Replication. Seo Jin Park Advised by: John Ousterhout CGAR: Strong Consistency without Synchronous Replication Seo Jin Park Advised by: John Ousterhout Improved update performance of storage systems with master-back replication Fast: updates complete before

More information

GFS: The Google File System

GFS: The Google File System GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one

More information

Dell EMC SAP HANA Appliance Backup and Restore Performance with Dell EMC Data Domain

Dell EMC SAP HANA Appliance Backup and Restore Performance with Dell EMC Data Domain Dell EMC SAP HANA Appliance Backup and Restore Performance with Dell EMC Data Domain Performance testing results using Dell EMC Data Domain DD6300 and Data Domain Boost for Enterprise Applications July

More information

Fast Tridiagonal Solvers on GPU

Fast Tridiagonal Solvers on GPU Fast Tridiagonal Solvers on GPU Yao Zhang John Owens UC Davis Jonathan Cohen NVIDIA GPU Technology Conference 2009 Outline Introduction Algorithms Design algorithms for GPU architecture Performance Bottleneck-based

More information

Scalable Shared Memory Programing

Scalable Shared Memory Programing Scalable Shared Memory Programing Marc Snir www.parallel.illinois.edu What is (my definition of) Shared Memory Global name space (global references) Implicit data movement Caching: User gets good memory

More information

QR Decomposition on GPUs

QR Decomposition on GPUs QR Decomposition QR Algorithms Block Householder QR Andrew Kerr* 1 Dan Campbell 1 Mark Richards 2 1 Georgia Tech Research Institute 2 School of Electrical and Computer Engineering Georgia Institute of

More information

Rhinoback Online Backup. In-File Delta

Rhinoback Online Backup. In-File Delta December 2006 Table of Content 1 Introduction... 3 1.1 Differential Delta Mode... 3 1.2 Incremental Delta Mode... 3 2 Delta Generation... 4 3 Block Size Setting... 4 4 During Backup... 5 5 During Restore...

More information

Adaptive Parallelism for Web Search

Adaptive Parallelism for Web Search Adaptive Parallelism for Web Search Myeongjae Jeon, Yuxiong He, Sameh Elnikety, Alan L. Cox, Scott Rixner Microsoft Research Rice University Redmond, WA, USA Houston, TX, USA Abstract A web search query

More information

Graphene-SGX. A Practical Library OS for Unmodified Applications on SGX. Chia-Che Tsai Donald E. Porter Mona Vij

Graphene-SGX. A Practical Library OS for Unmodified Applications on SGX. Chia-Che Tsai Donald E. Porter Mona Vij Graphene-SGX A Practical Library OS for Unmodified Applications on SGX Chia-Che Tsai Donald E. Porter Mona Vij Intel SGX: Trusted Execution on Untrusted Hosts Processing Sensitive Data (Ex: Medical Records)

More information

A Comparison of Capacity Management Schemes for Shared CMP Caches

A Comparison of Capacity Management Schemes for Shared CMP Caches A Comparison of Capacity Management Schemes for Shared CMP Caches Carole-Jean Wu and Margaret Martonosi Princeton University 7 th Annual WDDD 6/22/28 Motivation P P1 P1 Pn L1 L1 L1 L1 Last Level On-Chip

More information

Adaptive Android Kernel Live Patching

Adaptive Android Kernel Live Patching USENIX Security Symposium 2017 Adaptive Android Kernel Live Patching Yue Chen 1, Yulong Zhang 2, Zhi Wang 1, Liangzhao Xia 2, Chenfu Bao 2, Tao Wei 2 Florida State University 1 Baidu X-Lab 2 Android Kernel

More information

April 2010 Rosen Shingle Creek Resort Orlando, Florida

April 2010 Rosen Shingle Creek Resort Orlando, Florida Data Reduction and File Systems Jeffrey Tofano Chief Technical Officer, Quantum Corporation Today s Agenda File Systems and Data Reduction Overview File System and Data Reduction Integration Issues Reviewing

More information

BACKUP APP V7 CLOUUD FILE BACKUP & RESTORE GUIDE FOR WINDOWS

BACKUP APP V7 CLOUUD FILE BACKUP & RESTORE GUIDE FOR WINDOWS V7 CLOUUD FILE BACKUP & RESTORE GUIDE FOR WINDOWS Table of Contents 1 Overview... 1 1.1 About This Document... 7 2 Preparing for Backup and Restore... 8 2.1 Hardware Requirement... 8 2.2 Software Requirement...

More information

NPTEL Course Jan K. Gopinath Indian Institute of Science

NPTEL Course Jan K. Gopinath Indian Institute of Science Storage Systems NPTEL Course Jan 2012 (Lecture 39) K. Gopinath Indian Institute of Science Google File System Non-Posix scalable distr file system for large distr dataintensive applications performance,

More information

Backup APP v7. Office 365 Exchange Online Backup & Restore Guide for Mac OS X

Backup APP v7. Office 365 Exchange Online Backup & Restore Guide for Mac OS X Backup APP v7 Office 365 Exchange Online Backup & Restore Guide for Mac OS X Revision History Date Descriptions Type of modification 5 Apr 2017 First Draft New Table of Contents 1 Overview... 1 About This

More information

UNIC: Secure Deduplication of General Computations. Yang Tang and Junfeng Yang Columbia University

UNIC: Secure Deduplication of General Computations. Yang Tang and Junfeng Yang Columbia University UNIC: Secure Deduplication of General Computations Yang Tang and Junfeng Yang Columbia University The world s data is fast exploding A significant portion of the data is redundant. Data deduplication can

More information

Cheetah: An Efficient Flat Addressing Scheme for Fast Query Services in Cloud Computing

Cheetah: An Efficient Flat Addressing Scheme for Fast Query Services in Cloud Computing IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications Cheetah: An Efficient Flat Addressing Scheme for Fast Query Services in Cloud Computing Yu Hua Wuhan National

More information

Tardigrade: Leveraging Lightweight Virtual Machines to Easily and Efficiently Construct Fault-Tolerant Services

Tardigrade: Leveraging Lightweight Virtual Machines to Easily and Efficiently Construct Fault-Tolerant Services Tardigrade: Leveraging Lightweight Virtual Machines to Easily and Efficiently Construct Fault-Tolerant Services Jacob R. Lorch Andrew Baumann Lisa Glendenning Dutch T. Meyer Andrew Warfield Our goal: Turn

More information

Purity: building fast, highly-available enterprise flash storage from commodity components

Purity: building fast, highly-available enterprise flash storage from commodity components Purity: building fast, highly-available enterprise flash storage from commodity components J. Colgrove, J. Davis, J. Hayes, E. Miller, C. Sandvig, R. Sears, A. Tamches, N. Vachharajani, and F. Wang 0 Gala

More information

EdgeCourier: An Edge-hosted Personal Service for Low-bandwidth Document Synchronization in Mobile Cloud Storage Services

EdgeCourier: An Edge-hosted Personal Service for Low-bandwidth Document Synchronization in Mobile Cloud Storage Services EdgeCourier: An Edge-hosted Personal Service for Low-bandwidth Document Synchronization in Mobile Cloud Storage Services Pengzhan Hao, Yongshu Bai, Xin Zhang, and Yifan Zhang Department of Computer Science

More information

Table of Contents. Introduction 3

Table of Contents. Introduction 3 1 Table of Contents Introduction 3 Data Protection Technologies 4 Btrfs File System Snapshot Technology How shared folders snapshot works Custom Scripting for Snapshot Retention Policy Self-Service Recovery

More information

A Low-bandwidth Network File System

A Low-bandwidth Network File System A Low-bandwidth Network File System Athicha Muthitacharoen, Benjie Chen MIT Lab for Computer Science David Mazières NYU Department of Computer Science Motivation Network file systems are a useful abstraction...

More information

Moving Databases to Oracle Cloud: Performance Best Practices

Moving Databases to Oracle Cloud: Performance Best Practices Moving Databases to Oracle Cloud: Performance Best Practices Kurt Engeleiter Product Manager Oracle Safe Harbor Statement The following is intended to outline our general product direction. It is intended

More information

Protecting Private Data in the Cloud: A Path Oblivious RAM Protocol

Protecting Private Data in the Cloud: A Path Oblivious RAM Protocol Protecting Private Data in the Cloud: A Path Oblivious RAM Protocol Nathan Wolfe and Ethan Zou Mentors: Ling Ren and Xiangyao Yu Fourth Annual MIT PRIMES Conference May 18, 2014 Outline 1. Background 2.

More information

I/O Buffering and Streaming

I/O Buffering and Streaming I/O Buffering and Streaming I/O Buffering and Caching I/O accesses are reads or writes (e.g., to files) Application access is arbitary (offset, len) Convert accesses to read/write of fixed-size blocks

More information

PebblesDB: Building Key-Value Stores using Fragmented Log Structured Merge Trees

PebblesDB: Building Key-Value Stores using Fragmented Log Structured Merge Trees PebblesDB: Building Key-Value Stores using Fragmented Log Structured Merge Trees Pandian Raju 1, Rohan Kadekodi 1, Vijay Chidambaram 1,2, Ittai Abraham 2 1 The University of Texas at Austin 2 VMware Research

More information

WaveScalar. Winter 2006 CSE WaveScalar 1

WaveScalar. Winter 2006 CSE WaveScalar 1 WaveScalar Dataflow machine good at exploiting ILP dataflow parallelism traditional coarser-grain parallelism cheap thread management memory ordering enforced through wave-ordered memory Winter 2006 CSE

More information

Efficient Resource Allocation And Avoid Traffic Redundancy Using Chunking And Indexing

Efficient Resource Allocation And Avoid Traffic Redundancy Using Chunking And Indexing Efficient Resource Allocation And Avoid Traffic Redundancy Using Chunking And Indexing S.V.Durggapriya 1, N.Keerthika 2 M.E, Department of CSE, Vandayar Engineering College, Pulavarnatham, Thanjavur, T.N,

More information

Making the Box Transparent: System Call Performance as a First-class Result. Yaoping Ruan, Vivek Pai Princeton University

Making the Box Transparent: System Call Performance as a First-class Result. Yaoping Ruan, Vivek Pai Princeton University Making the Box Transparent: System Call Performance as a First-class Result Yaoping Ruan, Vivek Pai Princeton University Outline n Motivation n Design & implementation n Case study n More results Motivation

More information

Lecture 2: Performance

Lecture 2: Performance Lecture 2: Performance Today s topics: Technology wrap-up Performance trends and equations Reminders: YouTube videos, canvas, and class webpage: http://www.cs.utah.edu/~rajeev/cs3810/ 1 Important Trends

More information

Parallel Streaming Computation on Error-Prone Processors. Yavuz Yetim, Margaret Martonosi, Sharad Malik

Parallel Streaming Computation on Error-Prone Processors. Yavuz Yetim, Margaret Martonosi, Sharad Malik Parallel Streaming Computation on Error-Prone Processors Yavuz Yetim, Margaret Martonosi, Sharad Malik Upsets/B muons/mb Average Number of Dopant Atoms Hardware Errors on the Rise Soft Errors Due to Cosmic

More information

End-to-End Java Security Performance Enhancements for Oracle SPARC Servers Performance engineering for a revenue product

End-to-End Java Security Performance Enhancements for Oracle SPARC Servers Performance engineering for a revenue product End-to-End Java Security Performance Enhancements for Oracle SPARC Servers Performance engineering for a revenue product Luyang Wang, Pallab Bhattacharya, Yao-Min Chen, Shrinivas Joshi and James Cheng

More information

vsphere Replication 6.5 Technical Overview January 08, 2018

vsphere Replication 6.5 Technical Overview January 08, 2018 vsphere Replication 6.5 Technical Overview January 08, 2018 1 Table of Contents 1. VMware vsphere Replication 6.5 1.1.Introduction 1.2.Architecture Overview 1.3.Initial Deployment and Configuration 1.4.Replication

More information

HEAD HardwarE Accelerated Deduplication

HEAD HardwarE Accelerated Deduplication HEAD HardwarE Accelerated Deduplication Final Report CS710 Computing Acceleration with FPGA December 9, 2016 Insu Jang Seikwon Kim Seonyoung Lee Executive Summary A-Z development of deduplication SW version

More information

Kismet: Parallel Speedup Estimates for Serial Programs

Kismet: Parallel Speedup Estimates for Serial Programs Kismet: Parallel Speedup Estimates for Serial Programs Donghwan Jeon, Saturnino Garcia, Chris Louie, and Michael Bedford Taylor Computer Science and Engineering University of California, San Diego 1 Questions

More information

Chimera: Hybrid Program Analysis for Determinism

Chimera: Hybrid Program Analysis for Determinism Chimera: Hybrid Program Analysis for Determinism Dongyoon Lee, Peter Chen, Jason Flinn, Satish Narayanasamy University of Michigan, Ann Arbor - 1 - * Chimera image from http://superpunch.blogspot.com/2009/02/chimera-sketch.html

More information

Header Compression for TLV-based Packets

Header Compression for TLV-based Packets Header Compression for TLV-based Packets Marc Mosko Nov 5, 2015 Copyright Palo Alto Research Center, 2015 1 Header Compression in TLV World Compress all the signaling Fixed Header T and L fields V fields

More information

MySQL Performance Optimization and Troubleshooting with PMM. Peter Zaitsev, CEO, Percona Percona Technical Webinars 9 May 2018

MySQL Performance Optimization and Troubleshooting with PMM. Peter Zaitsev, CEO, Percona Percona Technical Webinars 9 May 2018 MySQL Performance Optimization and Troubleshooting with PMM Peter Zaitsev, CEO, Percona Percona Technical Webinars 9 May 2018 Few words about Percona Monitoring and Management (PMM) 100% Free, Open Source

More information

A Lightweight OpenMP Runtime

A Lightweight OpenMP Runtime Alexandre Eichenberger - Kevin O Brien 6/26/ A Lightweight OpenMP Runtime -- OpenMP for Exascale Architectures -- T.J. Watson, IBM Research Goals Thread-rich computing environments are becoming more prevalent

More information

Nexis User Guide.

Nexis User Guide. Sign In Go to the global login page at Choose the language you prefer to use within the Nexis interface. Enter your Nexis user ID and password. Check the Remember Me box to save your password & ID for

More information

Towards Weakly Consistent Local Storage Systems

Towards Weakly Consistent Local Storage Systems Towards Weakly Consistent Local Storage Systems Ji-Yong Shin 1,2, Mahesh Balakrishnan 2, Tudor Marian 3, Jakub Szefer 2, Hakim Weatherspoon 1 1 Cornell University, 2 Yale University, 3 Google 2 Consistency/Performance

More information

HPDedup: A Hybrid Prioritized Data Deduplication Mechanism for Primary Storage in the Cloud

HPDedup: A Hybrid Prioritized Data Deduplication Mechanism for Primary Storage in the Cloud HPDedup: A Hybrid Prioritized Data Deduplication Mechanism for Primary Storage in the Cloud Huijun Wu 1,4, Chen Wang 2, Yinjin Fu 3, Sherif Sakr 1, Liming Zhu 1,2 and Kai Lu 4 The University of New South

More information

PARLGRAN: Parallelism granularity selection for scheduling task chains on dynamically reconfigurable architectures *

PARLGRAN: Parallelism granularity selection for scheduling task chains on dynamically reconfigurable architectures * PARLGRAN: Parallelism granularity selection for scheduling task chains on dynamically reconfigurable architectures * Sudarshan Banerjee, Elaheh Bozorgzadeh, Nikil Dutt Center for Embedded Computer Systems

More information

harm nium Mahesh Balakrishnan MSR Helgi Kr. Sigurbjarnarson RU Ymir Vigfusson RU Pétur O. Ragnarsson RU

harm nium Mahesh Balakrishnan MSR Helgi Kr. Sigurbjarnarson RU Ymir Vigfusson RU Pétur O. Ragnarsson RU harm nium Helgi Kr. Sigurbjarnarson RU Pétur O. Ragnarsson RU Ymir Vigfusson RU Mahesh Balakrishnan MSR VM VM Host cannot reclaim space from virtual disk Hypervisor VM Virtual disk cannot expand beyond

More information

Parallel Computing Using OpenMP/MPI. Presented by - Jyotsna 29/01/2008

Parallel Computing Using OpenMP/MPI. Presented by - Jyotsna 29/01/2008 Parallel Computing Using OpenMP/MPI Presented by - Jyotsna 29/01/2008 Serial Computing Serially solving a problem Parallel Computing Parallelly solving a problem Parallel Computer Memory Architecture Shared

More information

Dynamic Fine Grain Scheduling of Pipeline Parallelism. Presented by: Ram Manohar Oruganti and Michael TeWinkle

Dynamic Fine Grain Scheduling of Pipeline Parallelism. Presented by: Ram Manohar Oruganti and Michael TeWinkle Dynamic Fine Grain Scheduling of Pipeline Parallelism Presented by: Ram Manohar Oruganti and Michael TeWinkle Overview Introduction Motivation Scheduling Approaches GRAMPS scheduling method Evaluation

More information

Sequential Monte Carlo Adaptation in Low-Anisotropy Participating Media. Vincent Pegoraro Ingo Wald Steven G. Parker

Sequential Monte Carlo Adaptation in Low-Anisotropy Participating Media. Vincent Pegoraro Ingo Wald Steven G. Parker Sequential Monte Carlo Adaptation in Low-Anisotropy Participating Media Vincent Pegoraro Ingo Wald Steven G. Parker Outline Introduction Related Work Monte Carlo Integration Radiative Energy Transfer SMC

More information

High Performance Computing on GPUs using NVIDIA CUDA

High Performance Computing on GPUs using NVIDIA CUDA High Performance Computing on GPUs using NVIDIA CUDA Slides include some material from GPGPU tutorial at SIGGRAPH2007: http://www.gpgpu.org/s2007 1 Outline Motivation Stream programming Simplified HW and

More information

Shengyue Wang, Xiaoru Dai, Kiran S. Yellajyosula, Antonia Zhai, Pen-Chung Yew Department of Computer Science & Engineering University of Minnesota

Shengyue Wang, Xiaoru Dai, Kiran S. Yellajyosula, Antonia Zhai, Pen-Chung Yew Department of Computer Science & Engineering University of Minnesota Loop Selection for Thread-Level Speculation, Xiaoru Dai, Kiran S. Yellajyosula, Antonia Zhai, Pen-Chung Yew Department of Computer Science & Engineering University of Minnesota Chip Multiprocessors (CMPs)

More information

Weaving Relations for Cache Performance

Weaving Relations for Cache Performance VLDB 2001, Rome, Italy Best Paper Award Weaving Relations for Cache Performance Anastassia Ailamaki David J. DeWitt Mark D. Hill Marios Skounakis Presented by: Ippokratis Pandis Bottleneck in DBMSs Processor

More information

Distributed Computing: PVM, MPI, and MOSIX. Multiple Processor Systems. Dr. Shaaban. Judd E.N. Jenne

Distributed Computing: PVM, MPI, and MOSIX. Multiple Processor Systems. Dr. Shaaban. Judd E.N. Jenne Distributed Computing: PVM, MPI, and MOSIX Multiple Processor Systems Dr. Shaaban Judd E.N. Jenne May 21, 1999 Abstract: Distributed computing is emerging as the preferred means of supporting parallel

More information

SFS: Random Write Considered Harmful in Solid State Drives

SFS: Random Write Considered Harmful in Solid State Drives SFS: Random Write Considered Harmful in Solid State Drives Changwoo Min 1, 2, Kangnyeon Kim 1, Hyunjin Cho 2, Sang-Won Lee 1, Young Ik Eom 1 1 Sungkyunkwan University, Korea 2 Samsung Electronics, Korea

More information

QoE Aware and Cell Capacity Enhanced Computation Offloading for Multi-Server Mobile Edge Computing Systems with Energy Harvesting Devices

QoE Aware and Cell Capacity Enhanced Computation Offloading for Multi-Server Mobile Edge Computing Systems with Energy Harvesting Devices QoE Aware and Cell Capacity Enhanced Computation Offloading for Multi-Server Mobile Edge Computing Systems with Energy Harvesting Devices Hai-Liang Zhao hliangzhao97@gmail.com Wuhan University of Technology

More information

Distributed File Systems II

Distributed File Systems II Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation

More information

Camdoop Exploiting In-network Aggregation for Big Data Applications Paolo Costa

Camdoop Exploiting In-network Aggregation for Big Data Applications Paolo Costa Camdoop Exploiting In-network Aggregation for Big Data Applications costa@imperial.ac.uk joint work with Austin Donnelly, Antony Rowstron, and Greg O Shea (MSR Cambridge) MapReduce Overview Input file

More information

Configuring Short RPO with Actifio StreamSnap and Dedup-Async Replication

Configuring Short RPO with Actifio StreamSnap and Dedup-Async Replication CDS and Sky Tech Brief Configuring Short RPO with Actifio StreamSnap and Dedup-Async Replication Actifio recommends using Dedup-Async Replication (DAR) for RPO of 4 hours or more and using StreamSnap for

More information

Software-defined Storage: Fast, Safe and Efficient

Software-defined Storage: Fast, Safe and Efficient Software-defined Storage: Fast, Safe and Efficient TRY NOW Thanks to Blockchain and Intel Intelligent Storage Acceleration Library Every piece of data is required to be stored somewhere. We all know about

More information

What is Network Acceleration?

What is Network Acceleration? What is Network Acceleration? How do WAN Optimization, Network Acceleration, and Protocol Streamlining work, and what can they do for your network? Contents Introduction Availability Improvement Data Reduction

More information

Reducing The De-linearization of Data Placement to Improve Deduplication Performance

Reducing The De-linearization of Data Placement to Improve Deduplication Performance Reducing The De-linearization of Data Placement to Improve Deduplication Performance Yujuan Tan 1, Zhichao Yan 2, Dan Feng 2, E. H.-M. Sha 1,3 1 School of Computer Science & Technology, Chongqing University

More information

Fast-Response Multipath Routing Policy for High-Speed Interconnection Networks

Fast-Response Multipath Routing Policy for High-Speed Interconnection Networks HPI-DC 09 Fast-Response Multipath Routing Policy for High-Speed Interconnection Networks Diego Lugones, Daniel Franco, and Emilio Luque Leonardo Fialho Cluster 09 August 31 New Orleans, USA Outline Scope

More information

ADVANCED DEDUPLICATION CONCEPTS. Thomas Rivera, BlueArc Gene Nagle, Exar

ADVANCED DEDUPLICATION CONCEPTS. Thomas Rivera, BlueArc Gene Nagle, Exar ADVANCED DEDUPLICATION CONCEPTS Thomas Rivera, BlueArc Gene Nagle, Exar SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies and individual members may

More information

Outline. Spanner Mo/va/on. Tom Anderson

Outline. Spanner Mo/va/on. Tom Anderson Spanner Mo/va/on Tom Anderson Outline Last week: Chubby: coordina/on service BigTable: scalable storage of structured data GFS: large- scale storage for bulk data Today/Friday: Lessons from GFS/BigTable

More information

CA485 Ray Walshe Google File System

CA485 Ray Walshe Google File System Google File System Overview Google File System is scalable, distributed file system on inexpensive commodity hardware that provides: Fault Tolerance File system runs on hundreds or thousands of storage

More information

BEST PRACTICES FOR BACKUP

BEST PRACTICES FOR BACKUP BEST PRACTICES FOR BACKUP Presenters: Carlo Monasterial - (IT Specialist) from WiConnect Corp. Blair Brubacher New ECC member There are two kinds of people in the world those who have had a hard drive

More information

BCStore: Bandwidth-Efficient In-memory KV-Store with Batch Coding. Shenglong Li, Quanlu Zhang, Zhi Yang and Yafei Dai Peking University

BCStore: Bandwidth-Efficient In-memory KV-Store with Batch Coding. Shenglong Li, Quanlu Zhang, Zhi Yang and Yafei Dai Peking University BCStore: Bandwidth-Efficient In-memory KV-Store with Batch Coding Shenglong Li, Quanlu Zhang, Zhi Yang and Yafei Dai Peking University Outline Introduction and Motivation Our Design System and Implementation

More information

Contents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11

Contents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11 Preface xvii Acknowledgments xix CHAPTER 1 Introduction to Parallel Computing 1 1.1 Motivating Parallelism 2 1.1.1 The Computational Power Argument from Transistors to FLOPS 2 1.1.2 The Memory/Disk Speed

More information