Bryce Canyon - Facebook's Flexible Hard Drive Storage Architecture


Storage Track
Bryce Canyon - Facebook's Flexible Hard Drive Storage Architecture
Austin Cousineau, Storage Hardware Engineer, Facebook

Agenda
- Flexibility
- Disaggregation
- Overview of the Project
- Why We Built It
- Improvements
- Serviceability
- What's Next
- Open Source Spirit
- Design Files

How Important Is Flexibility?
- Technology is no longer scaling on historical trends; new architectures and optimizations will be needed
- Designing a system for flexibility reduces the need for full re-designs
- Flexibility and modularity allow design re-use
- Utilizing OCP form factors and concepts allows faster adoption of new hardware and concepts

Disaggregation
- Storage capacity is scaling faster than storage throughput
- Increased network capacity and flexibility enable distributed systems
- Dense storage servers with integrated compute allow lightweight local services to support disaggregated storage, with services and resources on separate compute clusters
- A robust network infrastructure combined with an optimized system configuration enables optimal resource utilization and performance on the system, with shared resources on the network

What Is Bryce Canyon?
- Our latest disaggregated storage server and JBOD
- 4 OU storage server
- Two distinct storage nodes, each with 36 drives
- Leverages the common 1P compute server (Mono Lake)
- Uses OCP NICs (OCP Mezz)
- Modular and scalable to meet current and future challenges
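As a quick sanity check on the density figures, the per-chassis drive count follows directly from the numbers above (a back-of-the-envelope sketch, not from the slides themselves):

```python
# Drive count per chassis, from the stated configuration:
# a 4 OU chassis holding two storage nodes of 36 drives each.
nodes_per_chassis = 2
drives_per_node = 36
chassis_height_ou = 4

drives_per_chassis = nodes_per_chassis * drives_per_node
drives_per_ou = drives_per_chassis / chassis_height_ou

print(drives_per_chassis)  # 72
print(drives_per_ou)       # 18.0
```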

Bryce Canyon Major System Components
- Two storage nodes sharing a drive plane board
- Logically separated front and back 36-drive sets
- Separate compute cards, expander cards, and IO modules

Bryce Canyon Mono Lake & Drive Plane Board (labeled diagram)
Components called out: DDR4 DIMMs (4 x 32 GB), M.2 SATA boot drive, Broadwell-DE CPU, 4 x 92 mm fans, power entry, cable track, storage controller card, HDD SAS connector, rear DPB, air baffle, front DPB, 1P microserver (x2), input/output module connector

Bryce Canyon Storage Controller Card (labeled diagram)
Components called out: 12G SAS expander, 12G SAS IOC; shown in two variants: SCC with IOC & expander, and SCC with expander only

Bryce Canyon Input Output Module (IOM) (labeled diagrams: top views and front panel views)
Components called out: IOC, TPM module, BMC IC, SAS IO controller, 25 Gb/s OCP NIC, 2 x NVMe M.2 drives

Why Did We Build Bryce Canyon?
- Dense, modular design to accommodate different deployment configurations with a single chassis
- Enhanced serviceability of all major components
- Design reuse by leveraging existing microserver designs
- Improved system performance
- Efficient forced-air cooling for improved CFM/W and service time
- Maintains HDD performance over all operating conditions

How Does It Compare To Honey Badger?

Warm Storage (previous generation vs. Bryce Canyon):
- Compute: Avoton (8 core) vs. Broadwell-DE (16 core)
- RAM per compute: 32 GB DDR3 vs. 64 GB DDR4
- HDDs per compute: 30 vs. 36
- HDDs per rack: 450 vs. 576
- SSD slots (M.2) per compute: 1 x M.2 SATA vs. 2 x M.2 NVMe
- Max network BW per compute: 10 Gbps vs. 50 Gbps

Cold Storage (previous generation vs. Bryce Canyon):
- Compute: Dual Haswell (12 core) vs. Broadwell-DE (16 core)
- RAM per compute: 128 GB DDR3 vs. 128 GB DDR4
- HDDs per compute: 240 vs. 216
- HDDs per rack: 480 vs. 648
- SSD slots (M.2) per compute: 0 vs. 0
- Max network BW per compute: 10 Gbps vs. 50 Gbps

The Warm Storage version of Bryce Canyon provides ~4x compute, 2x DRAM, consumes 30% less power per HDD, and helps achieve a ~50% reduction in CFM/W.
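The warm-storage gains quoted above can be checked with quick arithmetic on the table's own figures (a sketch only; the "~4x compute" claim also reflects per-core performance, not just the core count):

```python
# Warm-storage deltas, previous generation (Honey Badger) vs.
# Bryce Canyon, using the figures from the comparison table.
prev = {"hdds_per_rack": 450, "ram_gb": 32, "net_gbps": 10, "cores": 8}
bryce = {"hdds_per_rack": 576, "ram_gb": 64, "net_gbps": 50, "cores": 16}

for key in prev:
    gain = bryce[key] / prev[key]
    print(f"{key}: {gain:.2f}x")
# hdds_per_rack: 1.28x
# ram_gb: 2.00x
# net_gbps: 5.00x
# cores: 2.00x
```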

Hard Drive Performance Improvements

Performance degradation (%), before vs. after:
- Continuous operation specification: <5% max (both before and after)
- Worst-case continuous operation: -5% max before, -1% max after
- Non-sustained operation specification: <10% max, <7% avg (both before and after)
- Worst-case non-sustained operation: -99.7% max before, -2.3% max after

Hard Drive Performance Improvements

Huge improvements in rear HDD degradation:
- More open fan guard
- Improved fan blade shape
- Metal honeycomb acoustic attenuator

(Images: fan guard, old blade shape, new blade shape, attenuator)

System Thermal Improvements

Optimized system mechanicals to increase service time and decrease CFM/W:
- Large improvement in rack-level CFM/W (0.122 CFM/W @ 30 °C) over Honey Badger systems (0.3 CFM/W @ 30 °C)
- Service time of 20 minutes in most conditions
- Improved fans to reduce HDD RV/AV issues
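The rack-level airflow figures above work out to roughly a 59% CFM/W reduction, consistent with the "~50% reduction" summarized on the Honey Badger comparison slide (a quick check, nothing more):

```python
# CFM/W figures quoted at 30 °C on the slide above.
honey_badger_cfm_per_w = 0.3
bryce_canyon_cfm_per_w = 0.122

reduction = 1 - bryce_canyon_cfm_per_w / honey_badger_cfm_per_w
print(f"{reduction:.0%}")  # 59%
```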

System Thermal Improvements: Service Time (chart)

Mechanical Improvements
- Increased safety factor on extension rails: 198 lb chassis, 325 lb load on front handles
- Improved tool-less latches for most components
- Improved drive plane replacement
- More robust under shock and vibration

(Images: tool-free HDD replacement, easy DPB replacement)

Serviceability
- Common service items can be swapped in under 3 minutes
- Less time servicing means lower downtime
- Clearly marked modules and OpenBMC allow quick diagnosis and replacement

(Images: NIC latches, SSD latches, SCC latches)

Deployment Flexibility
- Modular design enables easy changes to the compute card, NICs, NVMe drives, and expander card
- System is configurable for all-in-one storage server use, JBOD deployment, and JBOD head node configurations
  - Supports both warm storage and cold storage
- System flexibility allows designing a rack configuration to meet specific application needs
- Modular sub-systems scale for new media, technologies, and performance requirements

What's Next
- Updating the design to accommodate future drive technology
- Updated compute modules
- Improved thermal and mechanical performance
- Increased electrical performance and efficiency
- Incorporating feedback and lessons from mass deployment

Open Source Spirit & Call To Action
- The Open Compute Project is a powerful community for collaboration and idea sharing
- Strong participation yields larger gains in technology and design enablement
- Large-scale players can collaborate to steer the industry and suppliers
- Leveraging lessons from each other enables faster design and reduces duplicated effort
- Join and participate in OCP collaboration calls

Contributions to OCP
- Specification v1.0 and Design Package: https://www.opencompute.org/contributions?menu%5bspec_id%5d=s0147
- Where to Buy: https://www.opencompute.org/products/214/wiwynn-bryce-canyon-sas12g-storage-server-comprising-2-server-cards-with-up-to-36-hot-pluggable-hdds-each
- OpenBMC GitHub: https://github.com/facebook/openbmc
