The Lion of storage systems


The Lion of storage systems. Rakuten, Inc., Yosuke Hara. Mar 21, 2013

The Lion of storage systems: http://www.leofs.org. LeoFS v0.14.0 has been released!

Table of Contents
1. Motivation
2. Overview & Inside
3. Future Works
We seek further growth of LeoFS.

Motivation

Motivation (2010): we needed to store and manage a huge amount of files at low cost.

Why do we need a low-cost storage system? A huge amount of image files.

Motivation: we needed to move away from expensive storage. Problems:
1. Low ROI
2. Possibility of a SPOF
3. Storage expansion is difficult while data keeps growing

Motivation: many services face the same situation; our data increased five-fold.

Introducing LeoFS

Object Storage System: what kind of storage do we need for web services?
1. One huge storage
2. Non-stop storage
3. Specialized for the Web: not FUSE, but a REST API over HTTP

Object Storage System: what should we realize? HIGH availability (reliability), a HIGH cost-performance ratio, and HIGH scalability.

Object Storage System: from photo storage to cloud storage (5 MB, 100 MB, a few GB). The first step toward cloud storage: specialize in photos.

Object Storage System: from photo storage to cloud storage (5 MB, 100 MB, a few GB). We aim to be a storage platform in the cloud, able to store various unstructured data.

Object Storage System - for building a storage system: S3FS-C.

Object Storage System - storage centralized in LeoFS.

Object Storage System - to realize the storage platform: a GUI console (LeoTamer) and QoS (LeoDenebola).

Overview

LeoFS Overview: built (since 2010) on a foundation offering concurrency, distribution, and fault tolerance, proven in telecom, banking, e-commerce, instant messaging, and more. A robust + scalable storage system.

LeoFS Overview:
- LeoFS-Gateway: receives requests from web application(s) or a browser via a load balancer; a stateless proxy with HTTP request/response handling, an object cache, and the S3-API.
- LeoFS-Storage: storage engine/router nodes, each with object storage and metadata storage (META + Object Store) plus a replicator/recoverer and queues.
- LeoFS-Manager: keeps the system running and keeps it consistent; monitors the storage/gateway nodes, the routing table, and node state over SNMP, and provides a GUI console.

LeoFS Overview: no master, no SPOF. Requests from web application(s) or a browser pass through a load balancer to the S3-API gateways; LeoFS-Manager monitors the LeoFS-Gateway and LeoFS-Storage nodes (storage engine/router, META + Object Store) over SNMP and provides a GUI console.

LeoFS Overview - examples of building LeoFS:
- Minimum for development: Manager x 1, Gateway x 1, Storage x 1
- 20-50 TB storage system (# of replicas = 3): Manager x 2, Gateway x 3, Storage x 8-15 (10-20 TB / server)
- 50-300 TB storage system (# of replicas = 3): Manager x 2, Gateway x 4, Storage x 45-90 (10-20 TB / server)

Storage Platform centralized in LeoFS: photo storage, PaaS / IaaS, search / analysis, and application / log collectors.

Inside LeoFS

LeoFS Architecture: HTTP requests reach the Gateway (with its object cache); the Gateway talks to the Storage Cluster (P2P) over Erlang RPC; the Manager Cluster (backed by Mnesia) monitors state and processes, also over Erlang RPC.

LeoFS Gateway

LeoFS Gateway: a stateless proxy. Requests from applications arrive via the S3-API on Cowboy*; the object cache uses an LRU, a slab allocator, and a skip graph. Consistent hashing provides horizontal distribution, and replication is performed over RPC.
*Cowboy: an Erlang lightweight HTTP server - http://www.ninenines.eu/
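The consistent-hashing distribution mentioned above can be sketched roughly as follows. This is a hypothetical minimal ring in Python, not LeoFS's actual Erlang implementation; the node names, virtual-node count, and replica count are invented for the sketch:

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Place keys and virtual nodes on a ring via an MD5-based hash.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, vnodes_per_node=128):
        # Each physical node owns many virtual nodes to smooth the distribution.
        self._points = sorted(
            (_hash(f"{node}:{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )

    def replicas(self, key: str, n: int = 3):
        # Walk clockwise from the key's position, collecting n distinct nodes.
        start = bisect.bisect(self._points, (_hash(key),))
        chosen = []
        for i in range(len(self._points)):
            node = self._points[(start + i) % len(self._points)][1]
            if node not in chosen:
                chosen.append(node)
            if len(chosen) == n:
                break
        return chosen

ring = Ring(["storage_0", "storage_1", "storage_2", "storage_3", "storage_4"])
nodes = ring.replicas("photo/2013/03/cat.jpg", n=3)
```

Adding or detaching a node only remaps the keys adjacent to that node's virtual nodes, which is why a ring like this suits storage expansion.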

LeoFS Gateway - object cache behavior: on a cache hit, the gateway sends {$FilePath, $Checksum} to the storage cluster. If the checksum still matches, storage responds {ok, match} and the cached copy is served; if not, storage responds {ok, $metadata, $body} and the cache is refreshed. The object cache always keeps objects consistent.
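The validation flow above can be sketched in Python as follows. This is hypothetical pseudocode for the protocol, assuming an MD5 checksum per object; the real gateway/storage exchange happens over Erlang RPC:

```python
import hashlib

class StorageStub:
    """Stands in for the storage cluster: replies ("ok", "match") or returns the fresh object."""
    def __init__(self):
        self.objects = {}  # path -> body

    def get(self, path, cached_checksum=None):
        body = self.objects[path]
        checksum = hashlib.md5(body).hexdigest()
        if cached_checksum == checksum:
            return ("ok", "match")                        # cached copy is still valid
        return ("ok", {"checksum": checksum}, body)       # metadata + body

class Gateway:
    def __init__(self, storage):
        self.storage = storage
        self.cache = {}  # path -> (checksum, body)

    def get(self, path):
        cached = self.cache.get(path)
        reply = self.storage.get(path, cached[0] if cached else None)
        if reply == ("ok", "match"):
            return cached[1]                              # serve from cache
        _, meta, body = reply
        self.cache[path] = (meta["checksum"], body)       # refresh cache
        return body
```

Because the storage cluster confirms the checksum on every hit, a stale cache entry can never be served, which is what "always keeps objects consistent" means here.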

LeoFS Gateway - object cache (v0.14.0): a hierarchical object cache (RAM as the primary cache, SSD as the secondary cache) gives high performance (low latency) and high I/O efficiency, and reduces traffic between the Gateway and Storage.

LeoFS Storage

LeoFS Storage Engine: requests from the Gateway are handled by storage-engine worker(s), a replicator, a repairer, and queues. Metadata keeps an in-memory index of all data; the object container is a log-structured (append-only) file.

LeoFS Storage Engine: why do we need our own storage engine?
1. To store and retrieve various kinds of data
2. Controllable compaction

LeoFS Storage Engine - data structure:
<Metadata> (for retrieving a file/object, and for sync): {VNodeId, Key}, KeySize, custom-meta size, file size, offset, version, timestamp, checksum.
<Needle>: header (fixed-length metadata: checksum, KeySize, DataSize, user-meta size, offset, version, timestamp), body (variable length: {VNodeId, Key}, user-meta, the actual file), footer (8B).
<Object Container>: a super-block followed by needles (Needle-1, Needle-2, Needle-3, ...).
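As a rough illustration of the fixed-length-header / variable-length-body / 8-byte-footer layout, here is a toy needle codec in Python. The field widths and the CRC32 footer are invented for the sketch and do not reproduce LeoFS's on-disk format:

```python
import struct
import zlib

# Hypothetical fixed-length header: key size, data size, offset, version, timestamp.
HEADER = struct.Struct(">IIQQQ")
FOOTER = struct.Struct(">Q")  # 8-byte footer; here, a CRC32 of the body

def pack_needle(key: bytes, data: bytes, offset: int, version: int, ts: int) -> bytes:
    body = key + data
    return (HEADER.pack(len(key), len(data), offset, version, ts)
            + body
            + FOOTER.pack(zlib.crc32(body)))

def unpack_needle(buf: bytes):
    ksize, dsize, offset, version, ts = HEADER.unpack_from(buf, 0)
    body = buf[HEADER.size:HEADER.size + ksize + dsize]
    (crc,) = FOOTER.unpack_from(buf, HEADER.size + ksize + dsize)
    assert crc == zlib.crc32(body), "corrupted needle"
    key, data = body[:ksize], body[ksize:]
    return key, data, offset, version, ts
```

A fixed-length header means a reader can scan the container needle by needle without an index, which is what makes compaction and recovery straightforward.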

LeoFS Storage Engine - retrieving an object from the storage: the storage engine looks up the metadata (ID, filename, offset, size, checksum (MD5), version #) in the metadata storage, then reads the needle (header, file, footer) from the object container at that offset.

LeoFS Storage Engine - inserting an object into the storage: the storage engine appends the object to the object container, then inserts its metadata into the metadata storage.
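Putting the two structures together, a toy log-structured store (append-only container plus in-memory metadata index) might look like this. It is a minimal sketch of the idea, not LeoFS's actual code:

```python
import io

class ToyStore:
    def __init__(self):
        self.container = io.BytesIO()   # append-only object container
        self.index = {}                 # key -> (offset, size): in-memory metadata

    def put(self, key: str, data: bytes):
        offset = self.container.seek(0, io.SEEK_END)  # writes always append
        self.container.write(data)
        self.index[key] = (offset, len(data))         # then record the metadata

    def get(self, key: str) -> bytes:
        offset, size = self.index[key]                # metadata lookup first
        self.container.seek(offset)
        return self.container.read(size)              # then one read at the offset
```

Note that overwriting a key appends a new copy and leaves the old bytes behind in the log; those dead bytes are what compaction later reclaims.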

LeoFS Storage Engine - removing unnecessary objects from the storage: compaction copies live data from the OLD object container/metadata into a NEW object container/metadata.

LeoFS Storage Engine - removing unnecessary objects from the storage: the compaction manager is an FSM that performs phased compaction.
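The copy step of such a compaction can be sketched as: stream the old container, keep only the entries still referenced by live metadata, and write them into a new container with fresh offsets. This is a simplified sketch over a key -> (offset, size) index, not LeoFS's actual FSM:

```python
import io

def compact(container: io.BytesIO, index: dict):
    """Copy only live entries into a NEW container; return it with a NEW index.

    `index` maps key -> (offset, size). Deleted or overwritten data is simply
    no longer referenced by any index entry, so it is dropped during the copy.
    """
    new_container, new_index = io.BytesIO(), {}
    for key, (offset, size) in index.items():
        container.seek(offset)
        data = container.read(size)
        new_index[key] = (new_container.tell(), size)  # fresh offset in the new log
        new_container.write(data)
    return new_container, new_index
```

Doing this in phases (a portion of the container at a time) keeps the node serving requests during compaction, which is presumably why the manager is modeled as an FSM.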

We plan to support auto-compaction in v0.14.2.

Large-object Support

Large-object Support (LeoFS v0.12):
1. Equalizes disk usage across storage nodes
2. High I/O efficiency
The client splits the original file into chunks (chunk-0 .. chunk-3); each chunked object and its metadata is replicated in the storage cluster. The original file's metadata records the chunking, e.g. #metadata{... dsize = 10485768, cnumber = 4, csize = 3670016, ... }.
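The chunking scheme above, where a large object is split into fixed-size chunks and the parent metadata records the total size, chunk count, and chunk size, can be sketched as follows. The field names echo the #metadata record on the slide, but the function and chunk size here are illustrative, not LeoFS's API:

```python
def split_into_chunks(data: bytes, csize: int):
    # Split a large object into csize-byte chunks, as the client-side logic would.
    chunks = [data[i:i + csize] for i in range(0, len(data), csize)]
    # Parent metadata: total size, chunk count, chunk size
    # (cf. #metadata{dsize, cnumber, csize} on the slide).
    metadata = {"dsize": len(data), "cnumber": len(chunks), "csize": csize}
    return metadata, chunks

meta, chunks = split_into_chunks(b"x" * 10_000, csize=3_000)
```

Since each chunk is hashed and placed on the ring independently, a single large file spreads across many storage nodes, which is how disk usage is equalized.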

LeoFS Manager

LeoFS Manager: keeps high availability and makes system operation easy. The manager monitors the RING (routing table) and node state of the gateway(s) and the storage cluster (P2P): a hybrid strategy = P2P + Manager. Operations include status, suspend, resume, detach, whereis, ...

An integrated storage system: requests from web application(s) or a browser pass through a load balancer to the S3-API gateways; LeoFS-Manager monitors the LeoFS-Gateway and LeoFS-Storage nodes (storage engine/router, META + Object Store) over SNMP and provides a GUI console.

Brief benchmark report

Benchmark: LeoFS v0.12.7

Benchmark-1 setup: benchmarker x 2 against the S3-API; LeoFS-Manager, LeoFS-Gateway x 1, LeoFS-Storage x 5. Conditions: # of replicas = 3, # of successful WRITEs = 2, # of successful READs = 1, # of concurrent clients = 100.

Benchmark-1 environment:
- Network: 2 Gbps (bonding 1 Gbps x 2)
- OS: Ubuntu 12.04 LTS Server
- CPU: Xeon 2.2 GHz (2 cores / 4 threads)
- HDD: RAID-0 / HDD x 6
- RAM: 16 GB
- Erlang version: R15B03-1
At the start: # of objects = 1,000,000; # of replicas = 3; # of successful writes = 2.

Benchmark-1 results (READ : WRITE = 8 : 2; network ceiling 2 Gbps = 250 MB/sec), OPS and throughput by object size:
- 16 KB: 11,000 qps, 171.86 MB/sec
- 64 KB: 3,300 qps, 206.25 MB/sec
- 128 KB: 1,650 qps, 206.26 MB/sec
- 512 KB: 420 qps, 210.0 MB/sec
- 1 MB: 220 qps, 220.0 MB/sec
- 5 MB: 45 qps, 230 MB/sec

Demo

Future Works

Future Works - LeoFS QoS. Purpose:
1. To control requests from clients to LeoFS
2. To store and view LeoFS's traffic data
Inputs and notifications arrive (over UDP or otherwise) into the LeoFS QoS component (yet another Redis cluster plus a queue); calculated statistics are persisted to MySQL/NoSQL and reported to LeoFS's admin.
*Quality of service (QoS) refers to several related aspects of telephony and computer networks that allow the transport of traffic with special requirements.

Future Works - Multi-Datacenter Replication. Purpose: HIGH scalability and HIGH availability.

Wrap Up

Wrap Up: LeoFS, the GUI console, and LeoFS QoS.

Q & A

Thank you for your time. LeoFS - http://www.leofs.org
