Cloud-related Storage Research in Santa Cruz

1 Cloud-related Storage Research in Santa Cruz
Darrell Long
University of California, Santa Cruz

2 Trading Storage for Computation (and vice versa)

3 Trade Storage for Computation
[Diagram: Inputs -> Process -> Result]
Storing rarely used intermediate or final results is wasteful
But if results are still used, even infrequently, they cannot be discarded
In the cloud, computing a result is often cheaper than storing it
  Store the inputs and the process; recompute on demand
Challenges:
  Cost calculation: when is re-computation cheaper than storage?
  Provenance: what must be known in order to reproduce a result?

4 Old Idea, New Potential
[Diagram labels: Storage; Recomputation]
Balancing storage and computation is common
  Data de-duplication, file compression, dynamic programming...
The cloud allows opportunities for large-scale tradeoffs
  Quickly allocate resources to compute on demand
  No need for over-provisioning to prepare for rare events
  Purchase on-demand computation when needed

5 Example
Archive of 100,000 photos
  Provide bmp, jpeg, tiff, Adobe, png
Use Amazon Web Services
AWS Prices - June
  S3 Storage Cost             $0.18 per GB/month
  S3 Data In                  $0.03 per GB
  S3 Data Out                 $0.17 per GB
  EC2 Sm. Machine Instance    $0.10 per hour
  EC2 Data In                 $0.10 per GB
  EC2 Data Out                $0.17 per GB
Store Everything
  1600x1200 images in 5 formats = 2.2 TB
  100 GB of requests
  Cs = $ per month
Recompute Formats on Demand
  Raw BMP = 550 GB
  720 on-demand small Linux instance hours
  100 GB out from S3 to EC2 + 100 GB EC2 out
  Cr = $ per month
Cr = Cost of Re-computation
Cs = Cost of Storage
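The comparison on this slide can be sketched as a small cost model. The prices and quantities below come from the slide; how they combine is an assumption of this sketch (2.2 TB is treated as 2200 GB, and the 100 GB of requested data is assumed to be charged for S3 data out, EC2 data in, and EC2 data out). The function names are illustrative, and the results are not the figures from the original slide, which did not survive transcription.

```python
# Prices in USD, taken from the slide's AWS price table.
S3_STORAGE_PER_GB_MONTH = 0.18
S3_DATA_OUT_PER_GB = 0.17
EC2_SMALL_PER_HOUR = 0.10
EC2_DATA_IN_PER_GB = 0.10
EC2_DATA_OUT_PER_GB = 0.17


def cost_store_everything(stored_gb, requested_gb):
    """Monthly cost of keeping every format in S3 and serving requests from it."""
    return stored_gb * S3_STORAGE_PER_GB_MONTH + requested_gb * S3_DATA_OUT_PER_GB


def cost_recompute(raw_gb, instance_hours, requested_gb):
    """Monthly cost of storing only the raw BMPs and converting on demand in EC2."""
    storage = raw_gb * S3_STORAGE_PER_GB_MONTH
    compute = instance_hours * EC2_SMALL_PER_HOUR
    # Assumption: requested data is charged leaving S3, entering EC2, and leaving EC2.
    transfer = requested_gb * (
        S3_DATA_OUT_PER_GB + EC2_DATA_IN_PER_GB + EC2_DATA_OUT_PER_GB
    )
    return storage + compute + transfer


# Assumption: 2.2 TB is treated as 2200 GB.
cs = cost_store_everything(stored_gb=2200, requested_gb=100)
cr = cost_recompute(raw_gb=550, instance_hours=720, requested_gb=100)
print(f"Cs = ${cs:.2f} per month, Cr = ${cr:.2f} per month")
```

Under these assumptions recomputation comes out cheaper than storing all five formats, which is the direction of the slide's argument. Changing the request volume or instance hours shows where the tradeoff flips.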

6 Evolvable, Reliable, Energy-Efficient Disk-Based Archival Storage

7 Existing approaches to archival storage
                            Tape    Disk Array       MAID      Pergamum
  Media Costs               low     medium - high    medium    low
  Random Access Performance poor    good             medium    medium
  Centralized Controller    yes     usually          yes       no
  Power Usage               low     high             medium    low
  Media per reader          many    one              one       one

8 Pergamum tome
[Diagram labels: Tome; D0, D1, D2; Region with 3 Segments; Region with 5 Segments; Low-Power CPU; timestamp(d0), timestamp(d1), timestamp(d2); sig(d0), sig(d1), sig(d2); Network; SATA Disk; Segment Parity; NVRAM]
Low-power CPU & DRAM: NAS functions, consistency checking, parity ops
SATA hard drive: low-cost, persistent storage
  Data stored in segments with parity appended
NVRAM: metadata storage (data signatures, time stamps, index, etc.)
Ethernet controller: commodity interconnection network

9 Two-level reliability approach
[Diagram labels: Redundancy Group; Parity Region; regions (R) laid out across Tome 0, Tome 1, Tome 2, Tome 3]
Intra-disk: protection from latent sector errors
  Increased reliability reduces scrubbing needs
  Fix errors with as little spin-up as possible
Inter-disk: protection from device loss

10 Secure, Long-Term Storage Without Encryption

11 Keeping Secrets by Splitting Them
Creates n shares, m required for recovery
  n out of n schemes (e.g. XOR-based)
  m out of n threshold schemes (e.g. Shamir's scheme)
Benefits
  Fewer than m shares reveal no information
    Unconditionally secure
  Reconstruction not dependent on a single share
  Well suited to the group trust model
[Diagram labels: R_1 ... R_{n-1}; S]
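The n-out-of-n XOR scheme mentioned on this slide can be sketched in a few lines. This is a minimal illustration, not the POTSHARDS implementation: the function names are made up for the example, and the random shares come from Python's standard `secrets` module.

```python
import secrets
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_split(secret, n):
    """Split `secret` into n shares; all n are required to recover it."""
    if n < 2:
        raise ValueError("need at least two shares")
    random_shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    # The final share is the secret XORed with every random share.
    last = reduce(xor_bytes, random_shares, secret)
    return random_shares + [last]


def xor_combine(shares):
    """Recover the secret by XORing all shares together."""
    return reduce(xor_bytes, shares)


original = b"archival secret"
shares = xor_split(original, 3)
recovered = xor_combine(shares)
```

Any n - 1 of the shares are uniformly random bytes, which is why fewer than n shares reveal nothing about the secret. An m-out-of-n threshold scheme such as Shamir's needs polynomial arithmetic over a finite field and is not shown here.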

12 POTSHARDS Overview
[Diagram labels: User; Object; Fragment (x3); Shard (x4); Shard X, Shard Y, Parity Shard; Archive 0, Archive 1, Archive 2, Archive 3; Redundancy Group]

13 Two-Level Split Security
Adversary needs to compromise a lot of random data with two levels of splitting
With 50% of the shards, only 0.04% of data was revealed with a 3-way XOR split and 3-of-4 Shamir split
[Figure: % of secrets recovered vs. % of total shares; series ?of3, 2of4, 3of4. Panels: one level (Shamir split); two levels (3-way XOR, Shamir split)]

14 POTSHARDS User Model
Users access data with a POTSHARDS client
  Local installation
  Shared installation
POTSHARDS client transforms data
  Archives never see clear data
Archives manage redundancy themselves
[Diagram labels: POTSHARDS Client (x2); A, B, C; Redundancy Data]

15 Store, Forget & Check

16 What's the problem?
Systems store data on remote nodes
  Remote nodes may not be trustworthy
  Data owner must check to ensure that data is really stored
Two current approaches:
  Read data from multiple sites and check for consistency
  Generate checksum remotely and compare to checksum of local data
We developed an efficient algorithm that does not require
  Keeping a local copy of the data
  Keeping a set of questions and answers

17 Cloud storage: back-up
Clouds can be racks of servers or peer-to-peer
  Participants in the scheme offer limited storage on their machines in exchange for storing their own data
Data protected using parity or redundancy
  Extra blocks calculated using m/n redundancy codes
    Generate n blocks
    Require any m of the blocks to rebuild the data
  Many known mechanisms for m/n codes
    Linear interpolation
    XOR and Galois field-based
Participants need to be able to verify that other nodes are doing their part...
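The simplest m/n code is XOR parity with n = m + 1: any m of the n blocks are enough to rebuild the data. The sketch below illustrates only that special case, with hypothetical function names; general m/n codes (for example Galois field-based ones) tolerate more losses and are not shown.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the m data blocks plus one XOR parity block (n = m + 1)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(blocks):
    """Rebuild the single missing block (marked None) from the other m blocks."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) != 1:
        raise ValueError("this sketch repairs exactly one missing block")
    present = [b for b in blocks if b is not None]
    repaired = list(blocks)
    # XOR of all surviving blocks equals the lost block.
    repaired[missing[0]] = xor_blocks(present)
    return repaired


stored = encode([b"AAAA", b"BBBB", b"CCCC"])  # m = 3, n = 4
damaged = [stored[0], None, stored[2], stored[3]]
repaired = rebuild(damaged)
```

In a peer-to-peer back-up scheme each of the n blocks would live on a different participant, so losing any one participant leaves the data recoverable.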

18 Algebraic signatures
Algebraic properties
  Assume that X and Y are large data objects:
    $\mathrm{sig}(X \oplus Y) = \mathrm{sig}(X) \oplus \mathrm{sig}(Y)$
    $\mathrm{sig}(\beta \cdot X) = \beta \cdot \mathrm{sig}(X)$
  Multiplication is in the Galois field of the signature calculation
Signatures and parity formation commute
Signatures can be updated from the old signature and the signature of the delta (XOR) between old and new data
Signature calculation is fast!
  Hundreds of megabytes per second on a modern CPU
  Speed limited by disk bandwidth
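The two algebraic properties can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it uses a one-symbol signature sig(X) = x_0 + x_1·α + x_2·α² + ... over GF(2^8), with the reduction polynomial 0x11D and α = 2 chosen for the example. Addition in this field is XOR.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two elements of GF(2^8), reducing by the given polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def sig(data, alpha=2):
    """One-symbol algebraic signature: XOR-sum of data[i] * alpha**i."""
    result, power = 0, 1
    for byte in data:
        result ^= gf_mul(byte, power)
        power = gf_mul(power, alpha)
    return result


def xor_bytes(x, y):
    """Byte-wise XOR, i.e. parity formation over two blocks."""
    return bytes(a ^ b for a, b in zip(x, y))


def scale(beta, x):
    """Multiply every symbol of x by the field element beta."""
    return bytes(gf_mul(beta, b) for b in x)


X = bytes(range(1, 65))
Y = bytes((7 * i + 3) % 256 for i in range(64))
beta = 0x53

# Property 1: the signature of the XOR is the XOR of the signatures.
linear_ok = sig(xor_bytes(X, Y)) == sig(X) ^ sig(Y)
# Property 2: scaling the data scales the signature.
scale_ok = sig(scale(beta, X)) == gf_mul(beta, sig(X))
# Update from a delta: sig(new) = sig(old) XOR sig(old XOR new).
update_ok = sig(Y) == sig(X) ^ sig(xor_bytes(X, Y))
```

Because the signature is linear over the field, all three checks hold for any data and any choice of α; a production signature would typically use several symbols to reduce the collision probability.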

19 Our algorithm
Store data across distributed system
Challenge sites to prove that they hold the data
  Sites respond with the signatures of requested data
  Sites reveal tiny amount of information: size of signature
Challenger verifies that the signatures are consistent
[Diagram labels: D1, D2, D3; sig1, sig2, sig3, sigp]
Calculate signature of 32-byte ranges at 4 + i·71, i = 5, ..., 20
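The challenge-and-verify step can be sketched as follows. This is a hedged illustration, not the published protocol: it assumes three data sites plus one XOR parity site, a one-symbol signature over GF(2^8) (polynomial 0x11D, α = 2), and challenge offsets read as 4 + 71·i for i = 5..20, which is an interpretation of the slide's garbled expression. All function names are hypothetical.

```python
import random


def gf_mul(a, b, poly=0x11D):
    """Multiply two elements of GF(2^8), reducing by the given polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def sig(data, alpha=2):
    """One-symbol algebraic signature: XOR-sum of data[i] * alpha**i."""
    result, power = 0, 1
    for byte in data:
        result ^= gf_mul(byte, power)
        power = gf_mul(power, alpha)
    return result


def respond(block, offsets, length=32):
    """What a storage site returns: one signature per challenged range."""
    return [sig(block[o:o + length]) for o in offsets]


def consistent(data_responses, parity_response):
    """Challenger's check: data signatures must XOR to the parity signature."""
    for k, parity_sig in enumerate(parity_response):
        combined = 0
        for response in data_responses:
            combined ^= response[k]
        if combined != parity_sig:
            return False
    return True


rng = random.Random(0)
d1, d2, d3 = (bytes(rng.randrange(256) for _ in range(2048)) for _ in range(3))
parity = bytes(a ^ b ^ c for a, b, c in zip(d1, d2, d3))

# Assumed reading of the slide: 32-byte ranges at offsets 4 + 71*i, i = 5..20.
offsets = [4 + 71 * i for i in range(5, 21)]

honest = [respond(d, offsets) for d in (d1, d2, d3)]
ok = consistent(honest, respond(parity, offsets))

# A site holding a corrupted copy: one byte inside a challenged range is flipped.
corrupted = bytearray(d2)
corrupted[offsets[0]] ^= 0xFF
tampered = [respond(d1, offsets), respond(bytes(corrupted), offsets), respond(d3, offsets)]
caught = not consistent(tampered, respond(parity, offsets))
```

The challenger never needs a local copy of the data: it only compares the returned signatures against each other, and each site reveals no more than one signature's worth of information per challenged range.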


More information

Remote Data Checking: Auditing the Preservation Status of Massive Data Sets on Untrusted Store

Remote Data Checking: Auditing the Preservation Status of Massive Data Sets on Untrusted Store Remote Data Checking: Auditing the Preservation Status of Massive Data Sets on Untrusted Store Randal Burns randal@cs.jhu.edu www.cs.jhu.edu/~randal/ Department of Computer Science, Johns Hopkins Univers

More information

NPTEL Course Jan K. Gopinath Indian Institute of Science

NPTEL Course Jan K. Gopinath Indian Institute of Science Storage Systems NPTEL Course Jan 2012 (Lecture 39) K. Gopinath Indian Institute of Science Google File System Non-Posix scalable distr file system for large distr dataintensive applications performance,

More information

Trading Capacity for Data Protection

Trading Capacity for Data Protection Trading Capacity for Data Protection A Guide to Capacity Overhead on the StoreVault S500 How capacity is calculated What to expect Benefits of redundancy Introduction Drive-Level Capacity Losses Bytes

More information

Modernize Your Backup and DR Using Actifio in AWS

Modernize Your Backup and DR Using Actifio in AWS FOR AWS Modernize Your Backup and DR Using Actifio in AWS 150105H FOR AWS Modernize Your Backup and DR Using Actifio in AWS What is Actifio? Actifio virtualizes the data that s the lifeblood of business.

More information

Lecture 15 - Chapter 10 Storage and File Structure

Lecture 15 - Chapter 10 Storage and File Structure CMSC 461, Database Management Systems Spring 2018 Lecture 15 - Chapter 10 Storage and File Structure These slides are based on Database System Concepts 6th edition book (whereas some quotes and figures

More information

Lecture 21: Reliable, High Performance Storage. CSC 469H1F Fall 2006 Angela Demke Brown

Lecture 21: Reliable, High Performance Storage. CSC 469H1F Fall 2006 Angela Demke Brown Lecture 21: Reliable, High Performance Storage CSC 469H1F Fall 2006 Angela Demke Brown 1 Review We ve looked at fault tolerance via server replication Continue operating with up to f failures Recovery

More information

Pass4test Certification IT garanti, The Easy Way!

Pass4test Certification IT garanti, The Easy Way! Pass4test Certification IT garanti, The Easy Way! http://www.pass4test.fr Service de mise à jour gratuit pendant un an Exam : SOA-C01 Title : AWS Certified SysOps Administrator - Associate Vendor : Amazon

More information

Index construction CE-324: Modern Information Retrieval Sharif University of Technology

Index construction CE-324: Modern Information Retrieval Sharif University of Technology Index construction CE-324: Modern Information Retrieval Sharif University of Technology M. Soleymani Fall 2016 Most slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276, Stanford) Ch.

More information

Introduction. Collecting, Searching and Sorting evidence. File Storage

Introduction. Collecting, Searching and Sorting evidence. File Storage Collecting, Searching and Sorting evidence Introduction Recovering data is the first step in analyzing an investigation s data Recent studies: big volume of data Each suspect in a criminal case: 5 hard

More information

RAID. Redundant Array of Inexpensive Disks. Industry tends to use Independent Disks

RAID. Redundant Array of Inexpensive Disks. Industry tends to use Independent Disks RAID Chapter 5 1 RAID Redundant Array of Inexpensive Disks Industry tends to use Independent Disks Idea: Use multiple disks to parallelise Disk I/O for better performance Use multiple redundant disks for

More information

Dept. Of Computer Science, Colorado State University

Dept. Of Computer Science, Colorado State University CS 455: INTRODUCTION TO DISTRIBUTED SYSTEMS [HADOOP/HDFS] Trying to have your cake and eat it too Each phase pines for tasks with locality and their numbers on a tether Alas within a phase, you get one,

More information

WHICH HARD DRIVE IS RIGHT FOR YOU?

WHICH HARD DRIVE IS RIGHT FOR YOU? WHICH HARD DRIVE IS RIGHT FOR YOU? Choosing the right desktop drive can be a challenge. Not only is it difficult to tell one small grey box apart from all the other small grey boxes, but there are so many

More information

Tape Sucks for Long-Term Retention Time to Move to the Cloud. How Cloud is Transforming Legacy Data Strategies

Tape Sucks for Long-Term Retention Time to Move to the Cloud. How Cloud is Transforming Legacy Data Strategies Tape Sucks for Long-Term Retention Time to Move to the Cloud How Cloud is Transforming Legacy Data Strategies INTRODUCTION Tapes suck for long term retention (LTR) Unknown content Locked in proprietary

More information

Google is Really Different.

Google is Really Different. COMP 790-088 -- Distributed File Systems Google File System 7 Google is Really Different. Huge Datacenters in 5+ Worldwide Locations Datacenters house multiple server clusters Coming soon to Lenior, NC

More information

User manual. Cloudevo

User manual. Cloudevo User manual Cloudevo Table of contents Table of contents... p. 2 Introduction... p. 3 Use cases... p. 3 Setup... p. 3 Create account... p. 3 User interface... p. 4 Cache... p. 5 Cloud storage services...

More information

Introduction to carving File fragmentation Object validation Carving methods Conclusion

Introduction to carving File fragmentation Object validation Carving methods Conclusion Simson L. Garfinkel Presented by Jevin Sweval Introduction to carving File fragmentation Object validation Carving methods Conclusion 1 Carving is the recovery of files from a raw dump of a storage device

More information

Welcome to the New Era of Cloud Computing

Welcome to the New Era of Cloud Computing Welcome to the New Era of Cloud Computing Aaron Kimball The web is replacing the desktop 1 SDKs & toolkits are there What about the backend? Image: Wikipedia user Calyponte 2 Two key concepts Processing

More information