The Lion of storage systems Rakuten. Inc, Yosuke Hara Mar 21, 2013 1
The Lion of storage systems http://www.leofs.org LeoFS v0.14.0 was released! 2
Table of Contents 1. Motivation 2. Overview & Inside 3. Future Works We seek further growth of LeoFS 3
Motivation 4
Motivation 2010 We need to store and manage huge amount of Files at low-cost 5
Why do we need low-cost storage system? Huge amount of image-files 6
Motivation? Need to move from expensive storage to something Problems: 1. Low ROI 2. Possibility of SPOF 3. Storage Expansion is difficult during increasing data 7
Motivation Face Same Situation Increased in 5 times 8
Introducing LeoFS 9
Object Storage System What kind of storage we need for web-services? 1. ONE-Huge Storage 2. Non-Stop Storage 3. Specialized in the Web Not FUSE But REST-API over HTTP 10
Object Storage System What should we realize? HIGH Availability (Reliability) To Realize HIGH Cost Performance Ratio HIGH Scalability 11
Object Storage System From Photo Storage To Cloud Storage 5MB 100MB a few GB 1st step as Cloud Storage Specialize in Photo 12
Object Storage System From Photo Storage To Cloud Storage 5MB 100MB a few GB Aim to Storage Platform in the Cloud Able to store various unstructured-data 13
Object Storage System For building Storage System S3FS-C 14
Object Storage System In Centralized LeoFS Ratio 15
Object Storage System To Realize Storage Platform GUI-Console LeoTamer QoS LeoDenebola 16
Overview 17
LeoFS Overview 2010 Concurrency Distribution Fault tolerance Using in Telecom, Banking, e-commerce, Instant messaging,... Robust + Scalable Storage System 18
LeoFS Overview LeoFS-Gateway LeoFS-Storage Request from Web Application(s) or Browser Load Balancer Gateway (Stateless Proxy) HTTP Request/Response Handling + w/object Cache S3-API System LeoFS-Manager Management Keep running + Keep Consistency Manager Monitor Storage/Gateway RoutingTable Monitor NodeState Monitor SNMP Storage Engine/Router META Object Store Storage Storage Engine/Router Storage Engine/Router Object Storage, Meta data Storage + Replicator/Recoverer, Queue META Object Store META Object Store GUI Console 19
LeoFS Overview Request from Web Application(s) or Browser Load Balancer No Master No SPOF S3-API LeoFS-Manager LeoFS-Gateway LeoFS-Storage Monitor (SNMP) Storage Engine/Router Storage Engine/Router Storage Engine/Router GUI Console META Object Store META Object Store META Object Store 20
LeoFS Overview - Examples of building LeoFS 1 - Minimum for Development Manager x 1 Gateway x 1 Storage x 1 20-50TB Storage System (# of replicas = 3) Manager x 2 Gateway x 3.. Storage x 8-15 10TB.. 20TB / server 50-300TB Storage System (# of replicas = 3) Manager x 2 Gateway x 4.. Storage x 45-90 10TB.. 20TB / server 21
Storage Platform In Centralized LeoFS Photo-Storage PaaS / IaaS Search / Analysis Application / Log Collector 22
Inside LeoFS 23
LeoFS Architecture HTTP Gateway Object Cache Erlang RPC Erlang RPC Storage Cluster P2P Erlang RPC State/Process Monitor Manager Cluster Mnesia 24
LeoFS Gateway 25
LeoFS Gateway Stateless Proxy From Applications S3-API Cowboy Object Cache [ LRU, Slab allocator, Skip graph ] Consistent Hashing Horizontal Distribution Replicate when using RPC *Cowboy: Erlang light-weight HTTP-Server - http://http://www.ninenines.eu/ 26
LeoFS Gateway - Object Cache Object Cache Behavior Cache Hit {$FilePath, $Checksum} request from Client Match: {ok, match} NOT Match: {ok, $metadata, $body,} response Storage Cluster Object Cache always keep an object consistency 27
LeoFS Gateway - Object Cache (v0.14.0) Hierarchical Object Cache (Using RAM, SSD) HIGH-Performance (Low latency) High I/O efficiency Gateway Primary Cache Secondary Cache Storage Cluster Reduce traffic between Gateway and Storage 28
LeoFS Storage 29
LeoFS Storage Engine Request From Gateway LeoFS Storage storage-engine worker(s) replicator repairer queue... Metadata : Keeps an in-memory index of all data Object Container : Log structured (append-only) file 30
LeoFS Storage Engine Why do we need the original storage engine? 1. Input and Retrieve various data 2. Controllable Compaction 31
LeoFS Storage Engine - Data Structure <Metadata> for Retrieve an File (Object) for Sync {VNodeId, Key} KeySize Custom Meta Size File Size Offset Version Timestamp Checksum <Needle> User-Meta Size Checksum KeySize DataSize Offset Version Timestamp {VNodeId, Key} User-Meta Actual File Footer <Object Container> Header (Metadata - Fixed length) Body (Variable Length) Footer (8B) Super-block Needle-1 Needle-2 Needle-3 Needle-4 Needle-5 32
LeoFS Storage Engine - Retrieve an object from the storage Storage Engine < META META DATA DATA > > ID Id Filename Offset, Size Size Checksum (MD5) Version# Checksum Metadata Storage Header File Object Container Footer 33
LeoFS Storage Engine - Insert an object into the storage Storage Engine Insert a metadata Metadata Storage Data Append an object into the object container 34
LeoFS Storage Engine - Remove unnecessary objects from the storage Storage Engine Compact NEW Object Container/Metadata OLD Object Container/Metadata 35
LeoFS Storage Engine - Remove unnecessary objects from the storage Compaction Manager = FSM Phased Compaction 36
Plan to support auto-compaction with v0.14.2 37
Large object Support 38
Large-object Support (LeoFS v0.12) 1. Able to equalize disk usage of each storage node 2. High I/O efficiency chunk-0 Client chunk-1 chunk-2 chunk-3 Each chunked object and Metadata is replicated Original fils s Metadata #metadata{... dsize = 10485768, cnumber = 4, csize = 3670016,... } Storage Cluster 39
LeoFS Manager 40
LeoFS Manager For Keep High-Availability and Easy System Operation Manager Gateway(s) Monitor RING, Node State Hybrid Strategy = P2P + Manager Operate Storage Cluster P2P status, suspend, resume, detach, whereis,... 41
Request from Web Application(s) or Browser Load Balancer S3-API LeoFS-Manager LeoFS-Gateway LeoFS-Storage Integrated Storage System Monitor (SNMP) Storage Engine/Router Storage Engine/Router Storage Engine/Router GUI Console META Object Store META Object Store META Object Store 42
Brief benchmark report 43
Benchmark LeoFS v0.12.7 44
Benchmark-1 Benchmark-er [2] S3-API [Condition] # of Replica = 3 # of Successful WRITE = 2 # of Successful READ = 1 # of concurrent = 100 LeoFS-Manager LeoFS-Gateway [1] LeoFS-Storage [5] 45
Benchmark-1 Item Value Network 2Gbps (bonding 1Gbps*2) OS Ubuntu 12.04 LTS Server CPU XEON 2.2GHz (2Core/4Thread) HDD RAID-0 / HDD x 6 RAM 16GB Erlang s version R15B03-1 At the start of # of objects 1,000,000 # of replicas 3 # of successful write 2 46
Benchmark-1 READ : WRITE = 8 : 2 OPS Throughput (MB/sec) 2Gbps = 250MB/sec 15,000ops 220.0MB/sec 220.0MB/sec 206.25MB/sec206.26MB/sec210.0MB/sec 230MB/sec 11,250ops 171.86MB/sec 11,000qps 172.5MB/sec 7,500ops 115MB/sec 3,750ops 0ops 3,300qps 1,650qps 420qps 220qps 45qps 16KB 64KB 128KB 512KB 1MB 5MB 57.5MB/sec 0MB/sec 47
Demo 48
Future Works 49
Future Works - LeoFS QoS [Purpose] 1. Able to control request from clients to LeoFS 2. Able to store/see LeoFS s traffic data UDP or [ Input / Notify ] LeoFS QoS LeoFS s Admin Yet Another Redis-Cluster Queue Persistent calculated statistics-data MySQL/NoSQL Report * The quality of service (QoS) refers to several related aspects of telephony and computer networks that allow the transport of traffic with special requirements. 50
Future Works - Multi-Datacenter Replication [Purpose] HIGH-Scalability HIGH-Availability 51
Wrap Up 52
Wrap Up LeoFS GUI-Console LeoFS QoS 53
Q & A 54
Thank you for your time LeoFS - http://www.leofs.org 55
56