Kinetic Open Storage Platform: Enabling Break-through Economics in Scale-out Object Storage Ali Fenn & James Hughes Seagate Technology
2020: 7.3 Zettabytes 56% of total = in the cloud 90% Unstructured data 6.5 Zettabytes of unstructured data stored in the cloud 2010: 100 Exabytes 2
How big is 6.5 Zettabytes? 1 inch If you stacked 4TB hard drives side by side, they would circle the earth 3
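The comparison above can be checked with a few lines of arithmetic. This is a rough sketch, not part of the original deck: it assumes decimal units (1 ZB = 10^21 bytes, 1 TB = 10^12 bytes), reads the slide's "1 inch" as the thickness of one drive, and uses an equatorial circumference of about 40,075 km. The function names are illustrative.

```python
# Assumptions (not from the slide): decimal units and an Earth
# circumference of ~40,075 km at the equator.
ZETTABYTE = 10**21
TERABYTE = 10**12
INCH_IN_METERS = 0.0254
EARTH_CIRCUMFERENCE_KM = 40_075


def drives_needed(total_bytes, drive_bytes):
    """Number of drives required to hold total_bytes."""
    return total_bytes / drive_bytes


def stack_length_km(drive_count, thickness_inches=1.0):
    """Length of a row of drives placed side by side, in kilometers."""
    return drive_count * thickness_inches * INCH_IN_METERS / 1000


drives = drives_needed(6.5 * ZETTABYTE, 4 * TERABYTE)
length_km = stack_length_km(drives)
laps = length_km / EARTH_CIRCUMFERENCE_KM

print(f"drives: {drives:.3e}, length: {length_km:,.0f} km, laps: {laps:.2f}")
```

Under these assumptions the row is roughly 1.6 billion drives long and goes around the equator about once.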
6.5 Zettabytes Using today's architecture, it could cost cloud data centers: >$240,000,000,000/year in CAPEX + 1 Year OPEX 4
The Opportunity for Change How do we get there? Ecosystem = Open and Software-Defined HDFS Ceph Where do we go from here? 5
Look at Legacy Opportunity Server Storage Application File System DB POSIX File System Volume Manager Driver RAID 1986 POSIX 1988 NTFS 1993 1990s to 2000s XFS 1993 Storage Server RAID Battery Backed RAM CACHE FC SAS Devices SAS Interface SMR, Mapping Cylinder, Head, Sector Drive HDA 6
Storage to Enable New World Use Cases Application File System DB Standard Device Recording POSIX File System Volume Manager Driver Storage Server RAID Battery Backed RAM CACHE FC Devices SAS Interface SMR, Mapping Cylinder, Head, Sector Drive HDA SAS Key/Value Interface Ethernet Connectivity 7
The Kinetic Open Storage Platform Open Source Key/Value API and libraries Open Source Interface Specification Object storage software partners Systems partners Storage now fully disaggregated from compute 8
Simplifying Storage Advancements 4K Sector Transitions = Greater Agility Shingled Magnetic Recording Advanced Management Data Security 9
Performance Opportunities Performance Raw throughput IO utilization Data streams to drive as written Drive handles space mgmt. (no file system metadata) 10
TCO Impact Deploying a Kinetic-based architecture could deliver: Up to 50% lower TCO 11
Single Drive 1 Additional Chip to Start 12
Libraries, API enable Applications Application Clustering Management Proprietary to System Vendor C++, Java, Python, Erlang, DIY Interconnect ProtoBuf TCP/IP/GbE GPL Standard Storage Proprietary to Seagate 13
Multiple Masters Application Clustering Management Proprietary to System Vendor Interconnect ProtoBuf TCP/IP/GbE GPL Standard Storage Proprietary to Seagate 14
Multiple Drives, P2P Operations Application Clustering Management Proprietary to System Vendor Interconnect ProtoBuf TCP/IP/GbE GPL Standard Storage Proprietary to Seagate 15
Goals of the Kinetic API Data movement: Get/put/delete/getnext/getprevious; Versioned (== for success), options. Multiple masters: Authentication/Integrity/Authorization. Cluster-able: Simple cluster configuration version enforcement; 3rd party copy. Management. 16
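To make the data-movement goals concrete, here is a minimal in-memory sketch of a versioned key/value store with ordered traversal. It is a toy model, not the Kinetic library or wire protocol: the class `ToyKeyValueDrive`, the exception `VersionMismatch`, and all method signatures are hypothetical. It assumes "versioned (== for success)" means a put or delete succeeds only when the caller's expected version equals the stored one, which is what lets multiple masters write safely without a lock.

```python
import bisect


class VersionMismatch(Exception):
    """Raised when the caller's expected version differs from the stored one."""


class ToyKeyValueDrive:
    """In-memory stand-in for a key/value drive (hypothetical, for illustration)."""

    def __init__(self):
        self._keys = []   # sorted keys, used by getnext/getprevious
        self._store = {}  # key -> (value, version)

    def get(self, key):
        """Return (value, version), or None if the key is absent."""
        return self._store.get(key)

    def put(self, key, value, new_version, expected_version=None):
        """Store value only if expected_version == the current version."""
        current = self._store.get(key)
        current_version = current[1] if current else None
        if current_version != expected_version:
            raise VersionMismatch(key)
        if current is None:
            bisect.insort(self._keys, key)
        self._store[key] = (value, new_version)

    def delete(self, key, expected_version):
        """Remove the key only if expected_version == the current version."""
        current = self._store.get(key)
        if current is None or current[1] != expected_version:
            raise VersionMismatch(key)
        del self._store[key]
        self._keys.remove(key)

    def getnext(self, key):
        """Return (key, value, version) for the smallest key greater than key."""
        i = bisect.bisect_right(self._keys, key)
        if i == len(self._keys):
            return None
        k = self._keys[i]
        return (k,) + self._store[k]

    def getprevious(self, key):
        """Return (key, value, version) for the largest key smaller than key."""
        i = bisect.bisect_left(self._keys, key)
        if i == 0:
            return None
        k = self._keys[i - 1]
        return (k,) + self._store[k]
```

In this model, two masters that both read version `v1` and then try to write will see exactly one success; the loser gets `VersionMismatch` and must re-read before retrying.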
Management (System Vendor) Configures the drive: Network, Authorized clients. Monitors: Health, Statistics, Logs. Initiates recovery: Change cluster version, 3rdPartyCopy. 17
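One way to picture cluster version enforcement is a drive that rejects any request whose cluster version does not match its own, so that clients holding a stale view of the cluster cannot read or write until they refresh. The sketch below is an illustration under that assumption only; `ToyManagedDrive`, `ClusterVersionMismatch`, and the `stats` counters are hypothetical names, not the Kinetic management API.

```python
class ClusterVersionMismatch(Exception):
    """Raised when a request carries a cluster version the drive does not hold."""


class ToyManagedDrive:
    """Toy drive that enforces a cluster version on every request (hypothetical)."""

    def __init__(self, cluster_version=0):
        self.cluster_version = cluster_version
        self._data = {}
        # Simple counters standing in for the statistics a manager might monitor.
        self.stats = {"accepted": 0, "rejected": 0}

    def _check(self, cluster_version):
        """Reject the request unless its cluster version matches the drive's."""
        if cluster_version != self.cluster_version:
            self.stats["rejected"] += 1
            raise ClusterVersionMismatch(
                f"drive is at {self.cluster_version}, request used {cluster_version}"
            )
        self.stats["accepted"] += 1

    def put(self, cluster_version, key, value):
        self._check(cluster_version)
        self._data[key] = value

    def get(self, cluster_version, key):
        self._check(cluster_version)
        return self._data.get(key)

    def set_cluster_version(self, current_version, new_version):
        """Management operation: move the drive to a new cluster version."""
        self._check(current_version)
        self.cluster_version = new_version
```

In this model, a manager that initiates recovery bumps the cluster version on each drive; clients still using the old version are turned away until they pick up the new configuration.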
Standard HDD Form Factor Connector re-pinned Two Ethernet connections Connector 18
System Implications No new ports Ethernet v. SAS switch 19
Network Implications No impact to Data Center Networking Traditional Architecture Kinetic Architecture 20
Performance Opportunities 21
Map of Operations 22
Performance Expectations Same normal performance expectations Sequential Write: 50 MB/s Random Write: 50 MB/s Sequential Read: 50 MB/s Random Read: 1.2x slower than traditional drives 23
Write Performance Results [PRELIMINARY] 24
Swift - Traditional 2 25
Swift - Kinetic 2 26
Swift - Kinetic 2 27
Basho Riak 2 28
Basho Riak - Kinetic 2 29
HDFS - Kinetic 30
Scality Kinetic Model Direct data path from clients to kinetic drives Native support for file, object and block Geo distribution across multiple sites Mix of replication and erasure coding Geo distributed metadata cluster 31
Kinetic Fits All Scale-Out Storage Object Storage: Cloud Storage, Cloud Backup, Cloud Archive / Cold Storage (OpenStack Swift, S3, Riak CS). Distributed File System Architectures: Hadoop Distributed File System (HDFS), Google File System (GFS), Ceph, Windows Distributed File System (DFS), FhGFS, GlusterFS, Lustre. Distributed Database and Memory Systems: NoSQL (Cassandra, Voldemort, Riak), Memory (Memcached). 32
Summary The Kinetic Open Storage Platform: Lowers TCO Disaggregates storage from compute Improves performance Increases innovation agility and efficiency More info at: developers.seagate.com 33