CS-580K/480K Advanced Topics in Cloud Computing. Object Storage
2 When do we use object storage? When we check Facebook or Twitter, read Gmail, sync docs on Dropbox, check SharePoint, or take pictures with Instagram.
3 Object storage is good for: unstructured data workloads; large capacity requirements (e.g., hundreds of terabytes or more); data archiving (documents, e-mails, and backups); storage for photos, videos, and virtual machine images; needs for granular security and multi-tenancy; needs for automation, management, monitoring, and reporting tools; workloads that do not require high performance.
4 Object use cases: an object storage overview, with architectural examples from Cloudian.
5 Block vs. Object. Block: faster, for hot data; flash-optimized; IOPS-centric; VM-optimized. Object: bigger, for cool/cloud data; object-based; scale-out (multi-PB); software-centric.
6 Block vs. Object. Block: data is stored without any concept of data format or type; the data is simply a series of 0s and 1s; high-level applications or file systems must keep track of data location, context, and meaning. Object: an object consists of an object identifier (OID), data, and metadata; there is no object organization system (flat organization); individual objects are accessed directly, with no need to traverse directories.
7 How to build an object storage system. Case 1: Swift
9 Swift: storing & retrieving data. Flat namespace of accounts, containers, and objects; no nested directories. An account is a collection of containers: list containers with GET /v1/accountname/; create a container with PUT /v1/accountname/containername/. A container is a collection of objects: list objects with GET /v1/accountname/containername/; upload an object with PUT /v1/accountname/containername/objectname; retrieve an object with GET /v1/accountname/containername/objectname.
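The flat URL scheme above can be exercised with any HTTP client. A minimal sketch (the `swift_path` helper and the account/container/object names are illustrative, not part of Swift itself):

```python
def swift_path(account, container=None, obj=None):
    """Build a Swift API path: /v1/<account>[/<container>[/<object>]]."""
    parts = ["/v1", account]
    if container is not None:
        parts.append(container)
    if obj is not None:
        parts.append(obj)
    return "/".join(parts)

# Method + path pairs as Swift expects them:
print("GET", swift_path("myaccount"))                       # list containers
print("PUT", swift_path("myaccount", "photos"))             # create container
print("PUT", swift_path("myaccount", "photos", "cat.jpg"))  # upload object
```

A real client (e.g., python-swiftclient) would send these requests over HTTP with an auth token in the X-Auth-Token header.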
10 Basically two parts. Proxy server: exposes the Swift public (REST) API to users and streams data to and from the client upon request. Storage nodes: handle storage, replication, and management of objects, containers, and accounts.
11 Architecture overview (diagram): a PUT /v1/account/container/object request arrives at a proxy, which consults the rings and forwards it to object, container, and account servers on the storage nodes, each backed by its own disks.
12 Proxy server. Shared-nothing architecture that can be scaled out as needed; a load balancer can be placed ahead of the proxy servers. Objects are streamed directly between the proxy server and the client; there is no cache in between.
13 Object server. A very simple blob (i.e., binary large object) storage server that can store, retrieve, and delete objects stored on local devices. Objects are stored as binary files on the filesystem. Each object is stored using a path derived from the hash of the object's name and the operation's timestamp. The last write always wins, ensuring that the latest object version will be served.
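The name-hash-plus-timestamp layout can be sketched as follows (the directory layout and helper names here are illustrative; Swift's actual on-disk scheme also involves partitions and suffix directories):

```python
import hashlib

def object_path(name, timestamp):
    """Derive a storage path from the hash of the object's name; the file
    itself is named after the operation's timestamp."""
    h = hashlib.md5(name.encode()).hexdigest()
    return f"objects/{h[:3]}/{h}/{timestamp:.5f}.data"

def latest(paths):
    """Last write wins: the file with the greatest timestamp is served."""
    return max(paths)

p1 = object_path("photos/cat.jpg", 1700000000.0)  # older write
p2 = object_path("photos/cat.jpg", 1700000042.0)  # newer write
assert latest([p1, p2]) == p2  # the newer version is the one served
```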
14 Container server. The container server's primary job is to handle listings of objects. It doesn't know where those objects are, just which objects are in a specific container. The listings are stored as SQLite database files and replicated across the cluster, similar to how objects are. Statistics are also tracked, including the total number of objects and the total storage usage for that container.
15 Account server. The account server is very similar to the container server, except that it is responsible for listings of containers rather than objects.
16 The rings: mapping data to physical locations in the cluster. Three rings store three kinds of things (accounts, containers, and objects), and each ring works in the same way. For a given account, container, or object name, the ring returns information on its physical location within the storage nodes: a device look-up table to find out which device contains the target object, and a device list to find out which storage node that device belongs to.
17 Mapping using basic hash functions. Each object's name is hashed and the hash is taken modulo the number of drives:
hash(Image 1) % 4 = 2 → Drive 2
hash(Image 2) % 4 = 3 → Drive 3
hash(Image 3) % 4 = 3 → Drive 3
hash(Music 1) % 4 = 1 → Drive 1
hash(Music 2) % 4 = 0 → Drive 0
hash(Music 3) % 4 = 0 → Drive 0
hash(Movie 1) % 4 = 2 → Drive 2
hash(Movie 2) % 4 = 1 → Drive 1
Note: the MD5 algorithm is a widely used hash function producing a 128-bit hash value. Although MD5 was initially designed to be used as a cryptographic hash function, it has been found to suffer from extensive vulnerabilities. Problem?
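A toy version of this mod-N placement (the drive count and object names are illustrative) shows the problem the next slide raises: changing the number of drives remaps almost every object.

```python
import hashlib

def drive_for(name, num_drives):
    """Basic hash placement: md5(name) mod the number of drives."""
    h = int(hashlib.md5(name.encode()).hexdigest(), 16)
    return h % num_drives

objects = [f"object-{i}" for i in range(1000)]
before = {o: drive_for(o, 4) for o in objects}  # 4 drives
after = {o: drive_for(o, 5) for o in objects}   # add a 5th drive

moved = sum(1 for o in objects if before[o] != after[o])
print(f"{moved}/1000 objects must move")  # roughly 4 out of 5 move
```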
18 Problem? What if we have to add or remove drives? The hash values of all objects stay the same, but we must recompute the mapping value for every object and then re-map objects to different drives.
19 Swift: consistent hashing. The consistent hashing algorithm achieves a similar goal but does things differently: instead of computing a mapping value for each object, each drive is assigned a range of hash values to store.
Drive 0: 0000 ~ 3fff
Drive 1: 4000 ~ 7fff
Drive 2: 8000 ~ bfff
Drive 3: c000 ~ ffff
20 Mapping of objects to drives by hash value:
Image 1 (b5e7d988cfdb78bc3be1a9c221a8f744) → Drive 2
Image 2 → Drive 2
Image 3 (1213f717f7f754f050d0246fb7d6c43b) → Drive 0
Music 1 (4b46f1381a53605fc0f93a93d55bf8be) → Drive 1
Music 2 → Drive 3
Music 3 → Drive 1
Movie 1 (69db47ace5f026310ab170b02ac8bc58) → Drive 1
Movie 2 (c4abbd49974ba44c169c220dadbdac71) → Drive 3
21 With a new device, each drive gets a new range of hash values to store, while each object's hash value remains the same. Any object whose hash value is still within the range of its current drive stays where it is; any object whose hash value is no longer within the range of its current drive is re-mapped to another drive. With the consistent hashing algorithm, that number of objects is very small, compared to the basic hash function.
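A minimal consistent-hashing ring (the boundary placement and drive names are illustrative) makes the contrast concrete: adding a drive moves only the objects that fall into the new drive's slice of the ring.

```python
import bisect
import hashlib

def h(name):
    """Hash a name onto the ring (here: first 4 hex digits of md5)."""
    return int(hashlib.md5(name.encode()).hexdigest()[:4], 16)

def drive_for(name, ring):
    """ring: sorted list of (upper boundary, drive); an object maps to the
    first drive whose boundary is >= its hash value."""
    keys = [b for b, _ in ring]
    i = bisect.bisect_left(keys, h(name)) % len(ring)
    return ring[i][1]

objects = [f"object-{i}" for i in range(1000)]
old_ring = [(0x3fff, "d0"), (0x7fff, "d1"), (0xbfff, "d2"), (0xffff, "d3")]
# New drive d4 takes over the lower half of d0's range (0000-1fff):
new_ring = sorted(old_ring + [(0x1fff, "d4")])

moved = sum(1 for o in objects
            if drive_for(o, old_ring) != drive_for(o, new_ring))
print(f"{moved}/1000 objects must move")  # only objects hashing into 0000-1fff
```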
23 Problem? Each drive has one large range of hash values, so many objects may map to one (or a few) drives: imbalance.
24 Multiple markers in the consistent hashing algorithm. Instead of giving each drive one big hash range, multiple markers split those large hash ranges into smaller chunks. Multiple markers help distribute objects evenly across drives, which helps with load balancing.
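A sketch of multiple markers (often called virtual nodes; the marker count and hashing scheme here are illustrative): each drive claims many small points on the ring instead of one large range, which evens out the load.

```python
import bisect
import hashlib

def point(name):
    return int(hashlib.md5(name.encode()).hexdigest(), 16)

MARKERS_PER_DRIVE = 100
drives = ["d0", "d1", "d2", "d3"]

# Each drive places many markers on the ring.
ring = sorted((point(f"{d}-marker-{i}"), d)
              for d in drives for i in range(MARKERS_PER_DRIVE))
keys = [p for p, _ in ring]

def drive_for(name):
    i = bisect.bisect_left(keys, point(name)) % len(ring)
    return ring[i][1]

load = {d: 0 for d in drives}
for n in range(10000):
    load[drive_for(f"object-{n}")] += 1
print(load)  # each drive ends up with roughly a quarter of the objects
```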
25 In summary, what is the ring doing? Evenly mapping data to physical locations in the cluster; building (and rebuilding) the look-up table from object hash value to device; and maintaining the device list that identifies each device's location (storage node).
27 Data durability: ensuring your data stays the same for ages. Replicated or erasure coded? It depends on your use case. The proxy returns data only if the content matches the stored checksum. Continuously running background processes maintain durability: auditors ensure there is no bit-rot, quarantining replicas on a checksum mismatch; replicators ensure all objects are stored as multiple replicas on remote nodes (for replication); reconstructors recompute missing erasure-coding fragments (for erasure coding).
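An auditor's core check can be sketched in a few lines (the in-memory storage layout and quarantine step are simplified stand-ins; real auditors walk the on-disk files and move corrupt ones to a quarantine directory):

```python
import hashlib

def checksum(data):
    return hashlib.md5(data).hexdigest()

# Store an object together with the checksum computed at write time.
stored = {"data": b"important bytes", "etag": checksum(b"important bytes")}

def audit(obj):
    """Return True if the object is healthy, False if it must be quarantined."""
    return checksum(obj["data"]) == obj["etag"]

assert audit(stored)                 # healthy object passes the audit
stored["data"] = b"importent bytes"  # simulate bit-rot on disk
assert not audit(stored)             # mismatch -> replica would be quarantined
```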
28 Failure domains: ensuring high availability and durability with three replicas (diagram: 18 disks, Disk 0 through Disk 17, spread across six storage nodes, each node also hosting a proxy).
31 Failure domains (diagram continued): the same cluster is partitioned into zones (Zone 1, Zone 2, Zone 3) and regions (Region 1, Region 2), so the three replicas can be placed in separate failure domains.
33 Re-balancing (diagram): when a failure domain is lost, data is re-balanced across the remaining zones and regions to ensure a third replica.
34 Explore More
35 How to build an object storage system. Case 2: Ceph
36 System Overview
37 Key features. Decoupled data and metadata: files are striped onto predictably named objects, and CRUSH maps objects to storage devices. Dynamic distributed metadata management: dynamic subtree partitioning distributes metadata among MDSs. Object-based storage: OSDs handle migration, replication, failure detection, and recovery.
38 Client operation. The Ceph interface is nearly POSIX; data and metadata operations are decoupled; the implementation runs in user space, via FUSE or directly linked. Filesystem in Userspace (FUSE) is a software interface for Unix-like operating systems that lets non-privileged users create their own file systems without editing kernel code.
39 Client access example. The client sends an open request to the MDS; the MDS returns a capability, the file inode, the file size, and stripe information; the client reads/writes directly from/to the OSDs; finally the client sends a close request and provides details back to the MDS.
40 Distributed metadata. Metadata operations often make up as much as half of file system workloads, so effective metadata management is critical to overall system performance.
41 Dynamic subtree partitioning lets Ceph dynamically share the metadata workload among tens or hundreds of metadata servers (MDSs). Sharing is dynamic and based on current access patterns, resulting in near-linear performance scaling in the number of MDSs.
42 Distributed object storage. Files are split across objects; objects are members of placement groups; placement groups are distributed across OSDs.
43 Ceph first maps objects into placement groups (PGs) using a hash function; placement groups are then assigned to OSDs using a pseudo-random function (CRUSH).
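The two-step mapping can be sketched as follows (the hash choice, PG count, and OSD-selection scheme are illustrative stand-ins; real CRUSH walks a weighted hierarchy of failure domains):

```python
import hashlib

NUM_PGS = 64
NUM_OSDS = 10
REPLICAS = 3

def pg_for(object_name):
    """Step 1: hash the object name into a placement group."""
    h = int(hashlib.md5(object_name.encode()).hexdigest(), 16)
    return h % NUM_PGS

def osds_for(pg):
    """Step 2: pseudo-randomly, but deterministically, pick distinct OSDs
    for the PG -- a toy stand-in for CRUSH."""
    chosen = []
    i = 0
    while len(chosen) < REPLICAS:
        h = int(hashlib.md5(f"{pg}-{i}".encode()).hexdigest(), 16)
        osd = h % NUM_OSDS
        if osd not in chosen:
            chosen.append(osd)
        i += 1
    return chosen

pg = pg_for("myfile.part0")
print(pg, osds_for(pg))  # same inputs always yield the same placement
```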
44 CRUSH: S. A. Weil, S. A. Brandt, E. L. Miller, and C. Maltzahn. CRUSH: Controlled, scalable, decentralized placement of replicated data. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing (SC '06), Tampa, FL, November 2006. ACM.
45 Replication. Objects are replicated on OSDs within the same PG. The primary forwards updates to the other replicas and sends an ACK to the client once all replicas have received the update (slow but safe); replicas send a final commit once they have committed the update to disk.
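The primary-copy update flow can be simulated in a few lines (the class and method names are illustrative; the point is that the client's ACK waits until every replica has received the write, with the disk commit happening later):

```python
class OSD:
    def __init__(self, name):
        self.name = name
        self.received = {}   # updates received in memory
        self.committed = {}  # updates committed to disk

    def receive(self, key, value):
        self.received[key] = value

    def commit(self, key):
        self.committed[key] = self.received[key]

def client_write(primary, replicas, key, value):
    """Primary forwards to replicas; ACK only after all have received."""
    primary.receive(key, value)
    for r in replicas:
        r.receive(key, value)
    return "ACK"  # safe: every replica now holds the update

osds = [OSD(f"osd{i}") for i in range(3)]
print(client_write(osds[0], osds[1:], "obj1", b"data"))  # -> ACK
for o in osds:
    o.commit("obj1")  # later: final commit once flushed to disk
```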
46 Failure detection and recovery. Failed OSDs are marked "down" and then "out"; monitors check for intermittent problems; new or recovered OSDs peer with the other OSDs within their PG.
47 Conclusion. Ceph and Swift share some similar concepts, though implemented differently: how to identify an object (rings vs. CRUSH), how to distribute objects evenly (rings vs. CRUSH), and how to provide reliability (replication).
48 Erasure code. Replication stores full copies of stored objects; erasure coding stores one copy plus parity.
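The trade-off can be illustrated with the simplest erasure code, a single XOR parity block over k data blocks (real systems use Reed-Solomon codes with multiple parity fragments; the block contents here are illustrative):

```python
def xor_blocks(blocks):
    """XOR equal-length blocks byte by byte."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

data_blocks = [b"AAAA", b"BBBB", b"CCCC"]  # k = 3 data blocks
parity = xor_blocks(data_blocks)           # 1 parity block: 4/3x overhead

# Lose one data block; recover it by XORing the survivors with the parity.
lost = data_blocks[1]
recovered = xor_blocks([data_blocks[0], data_blocks[2], parity])
assert recovered == lost
print("recovered:", recovered)  # -> recovered: b'BBBB'
```

Three-way replication of the same data would cost 3x the storage; this code stores 4/3x while still surviving the loss of any single block.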
49 Sources
1. Christian Schwede, "Forget everything you knew about Swift Rings"
2. Swift (YouTube: GIU&feature=youtu.be)
3. Ceph
More informationRESAR: Reliable Storage at Exabyte Scale Reconsidered
RESAR: Reliable Storage at Exabyte Scale Reconsidered Thomas Schwarz, SJ, Ahmed Amer, John Rose Marquette University, Milwaukee, WI, thomas.schwarz@marquette.edu Santa Clara University, Santa Clara, CA,
More informationThe Design and Implementation of AQuA: An Adaptive Quality of Service Aware Object-Based Storage Device
The Design and Implementation of AQuA: An Adaptive Quality of Service Aware Object-Based Storage Device Joel Wu and Scott Brandt Department of Computer Science University of California Santa Cruz MSST2006
More informationAnalytics in the cloud
Analytics in the cloud Dow we really need to reinvent the storage stack? R. Ananthanarayanan, Karan Gupta, Prashant Pandey, Himabindu Pucha, Prasenjit Sarkar, Mansi Shah, Renu Tewari Image courtesy NASA
More informationDistributed Systems. 15. Distributed File Systems. Paul Krzyzanowski. Rutgers University. Fall 2017
Distributed Systems 15. Distributed File Systems Paul Krzyzanowski Rutgers University Fall 2017 1 Google Chubby ( Apache Zookeeper) 2 Chubby Distributed lock service + simple fault-tolerant file system
More informationPPMS: A Peer to Peer Metadata Management Strategy for Distributed File Systems
PPMS: A Peer to Peer Metadata Management Strategy for Distributed File Systems Di Yang, Weigang Wu, Zhansong Li, Jiongyu Yu, and Yong Li Department of Computer Science, Sun Yat-sen University Guangzhou
More informationEngineering Goals. Scalability Availability. Transactional behavior Security EAI... CS530 S05
Engineering Goals Scalability Availability Transactional behavior Security EAI... Scalability How much performance can you get by adding hardware ($)? Performance perfect acceptable unacceptable Processors
More informationSEP sesam Backup & Recovery to SUSE Enterprise Storage. Hybrid Backup & Disaster Recovery
Hybrid Backup & Disaster Recovery SEP sesam Backup & Recovery to SUSE Enterprise Reference Architecture for using SUSE Enterprise (SES) as an SEP sesam backup target 1 Global Management Table of Contents
More informationDynamic Object Routing
Dynamic Object Routing Balaji Ganesan Bharat Boddu Cloudian 2016 Storage Developer Conference. Insert Your Company Name. All Rights Reserved. HyperStore System Overview 1. Full Amazon S3 API Compatibility,
More informationPRESENTATION TITLE GOES HERE. Understanding Architectural Trade-offs in Object Storage Technologies
Object Storage 201 PRESENTATION TITLE GOES HERE Understanding Architectural Trade-offs in Object Storage Technologies SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA
More informationDistributed Systems. 15. Distributed File Systems. Paul Krzyzanowski. Rutgers University. Fall 2016
Distributed Systems 15. Distributed File Systems Paul Krzyzanowski Rutgers University Fall 2016 1 Google Chubby 2 Chubby Distributed lock service + simple fault-tolerant file system Interfaces File access
More information18-hdfs-gfs.txt Thu Oct 27 10:05: Notes on Parallel File Systems: HDFS & GFS , Fall 2011 Carnegie Mellon University Randal E.
18-hdfs-gfs.txt Thu Oct 27 10:05:07 2011 1 Notes on Parallel File Systems: HDFS & GFS 15-440, Fall 2011 Carnegie Mellon University Randal E. Bryant References: Ghemawat, Gobioff, Leung, "The Google File
More informationSamba and Ceph. Release the Kraken! David Disseldorp
Samba and Ceph Release the Kraken! David Disseldorp ddiss@samba.org Agenda Ceph Overview State of Samba Integration Performance Outlook Ceph Distributed storage system Scalable Fault tolerant Performant
More informationHDFS Architecture. Gregory Kesden, CSE-291 (Storage Systems) Fall 2017
HDFS Architecture Gregory Kesden, CSE-291 (Storage Systems) Fall 2017 Based Upon: http://hadoop.apache.org/docs/r3.0.0-alpha1/hadoopproject-dist/hadoop-hdfs/hdfsdesign.html Assumptions At scale, hardware
More informationGFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures
GFS Overview Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures Interface: non-posix New op: record appends (atomicity matters,
More informationDistributed Computations MapReduce. adapted from Jeff Dean s slides
Distributed Computations MapReduce adapted from Jeff Dean s slides What we ve learnt so far Basic distributed systems concepts Consistency (sequential, eventual) Fault tolerance (recoverability, availability)
More informationDiscover CephFS TECHNICAL REPORT SPONSORED BY. image vlastas, 123RF.com
Discover CephFS TECHNICAL REPORT SPONSORED BY image vlastas, 123RF.com Discover CephFS TECHNICAL REPORT The CephFS filesystem combines the power of object storage with the simplicity of an ordinary Linux
More informationDistributed Systems. Tutorial 9 Windows Azure Storage
Distributed Systems Tutorial 9 Windows Azure Storage written by Alex Libov Based on SOSP 2011 presentation winter semester, 2011-2012 Windows Azure Storage (WAS) A scalable cloud storage system In production
More informationEffizientes Speichern von Cold-Data
Effizientes Speichern von Cold-Data Dr. Dirk Gebh Storage Sales Consultant Oracle Deutschland Program Agenda 1 2 3 4 5 Cold-Data OHSM Introduction Use Case Removing Cold Data from Primary Storage OHSM
More informationTake Back Lost Revenue by Activating Virtuozzo Storage Today
Take Back Lost Revenue by Activating Virtuozzo Storage Today JUNE, 2017 2017 Virtuozzo. All rights reserved. 1 Introduction New software-defined storage (SDS) solutions are enabling hosting companies to
More informationSummary optimized CRUSH algorithm more than 10% read performance improvement Design and Implementation: 1. Problem Identification 2.
Several months ago we met an issue of read performance issues (17% degradation) when working on ceph object storage performance evaluation with 10M objects (scaling from 10K objects to 1Million objects),
More informationDistributed File Storage in Multi-Tenant Clouds using CephFS
Distributed File Storage in Multi-Tenant Clouds using CephFS Openstack Vancouver 2018 May 23 Patrick Donnelly CephFS Engineer Red Hat, Inc. Tom Barron Manila Engineer Red Hat, Inc. Ramana Raja CephFS Engineer
More informationUsing Cloud Services behind SGI DMF
Using Cloud Services behind SGI DMF Greg Banks Principal Engineer, Storage SW 2013 SGI Overview Cloud Storage SGI Objectstore Design Features & Non-Features Future Directions Cloud Storage
More information