Study of the viability of a Green Storage for the ALICE-T1. Eduardo Murrieta Técnico Académico: ICN - UNAM

Similar documents
Large-scale Archival Storage - a brief overview for the HEP use case -

<Insert Picture Here> Tape Technologies April 4, 2011

Distributed File Systems Part IV. Hierarchical Mass Storage Systems

The creation of a Tier-1 Data Center for the ALICE experiment in the UNAM. Lukas Nellen ICN-UNAM

CS370 Operating Systems

Deep Storage for Exponential Data. Nathan Thompson CEO, Spectra Logic

DX-Series: Disk Archive to 320 TB with Continuous Backup to LTO, Optical Disc or the Cloud

The personal computer system uses the following hardware device types -

Summary of the LHC Computing Review

I/O CANNOT BE IGNORED

IT Certification Exams Provider! Weofferfreeupdateserviceforoneyear! h ps://

Exam : Title : Storage Sales V2. Version : Demo

DEDUPLICATION BASICS

Storage Systems. Storage Systems

SXL-6500 Series: LTO Digital Video Archives

Virtualizing a Batch. University Grid Center

Cluster Setup and Distributed File System

Costefficient Storage with Dataprotection

DISK LIBRARY FOR MAINFRAME (DLM)

High-Energy Physics Data-Storage Challenges

SMART SERVER AND STORAGE SOLUTIONS FOR GROWING BUSINESSES

Monitoring system for geographically distributed datacenters based on Openstack. Gioacchino Vino

Mass-Storage Systems

Replicating Mainframe Tape Data for DR Best Practices

Data storage services at KEK/CRC -- status and plan

CHAPTER 12: MASS-STORAGE SYSTEMS (A) By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

Worldwide Production Distributed Data Management at the LHC. Brian Bockelman MSST 2010, 4 May 2010

16/06/56. Secondary Storage. Secondary Storage. Secondary Storage The McGraw-Hill Companies, Inc. All rights reserved.

Your Challenges, Our Solutions. The NAS and/or laptop storage are not enough; how can I upgrade once and expand all?

A FLEXIBLE ARM-BASED CEPH SOLUTION

Cold Storage: The Road to Enterprise Ilya Kuznetsov YADRO

Using Simulation to Design Scalable and Cost-Efficient Archival Storage Systems

Effizientes Speichern von Cold-Data

Scientific data processing at global scale The LHC Computing Grid. fabio hernandez

SXL-8: LTO-8 Archive System. managed by XenData6 Server software. Eight cartridge library simplifies management of your offline files.

WHICH HARD DRIVE IS RIGHT FOR YOU?

The CMS Computing Model

Chapter 10: Mass-Storage Systems

IT Certification Exams Provider! Weofferfreeupdateserviceforoneyear! h ps://

QuickSpecs. HP RDX Removable Disk Cartridges Overview. Models RDX 160GB Removable Disk Cartridge Q2040A RDX 500GB Removable Disk Cartridge

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition

QuickSpecs. What's New RDX 750GB Removable Disk Cartridge. HP RDX Removable Disk Cartridges Overview. Models RDX 160GB Removable Disk Cartridge Q2040A

Technical Requirements

CONTENT. Overview 1 LED indication. RAID Mode Selection 3. Quick Installation Guide 4. About RAID RAID function. System Requirements

MD4 esata. 4-Bay Rack Mount Chassis. User Manual February 6, v1.0

QuickSpecs. What's New RDX 1TB Removable Disk Cartridge. HP RDX Removable Disk Cartridges. Overview

Data Storage. Paul Millar dcache

Database Services at CERN with Oracle 10g RAC and ASM on Commodity HW

2TB DATA SHEET Preliminary

IBM System Storage DS8000 series (Machine types 2421, 2422, 2423, and 2424) delivers new security, scalability, and business continuity capabilities

SXL-8500 Series: LTO Digital Video Archives

XenData SXL- 4200N Archive System: Getting Started & Library Operation Basics. Document last updated: November 30, 2017.

Introduction to NetApp E-Series E2700 with SANtricity 11.10

Storage on the Lunatic Fringe. Thomas M. Ruwart University of Minnesota Digital Technology Center Intelligent Storage Consortium

Taurus Super-S LCM. Dual-Bay RAID Storage Enclosure for two 3.5 Serial ATA Hard Drives. User Manual July 27, v1.2

VX3000-E Unified Network Storage

Warehouse-Scale Computing

Choosing Hardware and Operating Systems for MySQL. Apr 15, 2009 O'Reilly MySQL Conference and Expo Santa Clara,CA by Peter Zaitsev, Percona Inc

Operating Systems. Mass-Storage Structure Based on Ch of OS Concepts by SGG

Data Movement & Tiering with DMF 7

All-Flash High-Performance SAN/NAS Solutions for Virtualization & OLTP

EMC SYMMETRIX VMAX 40K STORAGE SYSTEM

EMC SYMMETRIX VMAX 40K SYSTEM

Spanish Tier-2. Francisco Matorras (IFCA) Nicanor Colino (CIEMAT) F. Matorras N.Colino, Spain CMS T2,.6 March 2008"

CS370 Operating Systems

IBM TotalStorage 2104 Expandable Storage Plus 320 offers new high-performance disk drive capacity

HP VMA-series Memory Arrays

Grid Computing Activities at KIT

ZFS User Conference Large Scale Homelab Backups Mike Trogni

The Software Defined Online Storage System at the GridKa WLCG Tier-1 Center

Cisco MCS 7845-H1 Unified CallManager Appliance

Modular tape library. StorageTek SL150. Simple and scalable data storage. Key features. Key benefits. Hotline

Frequently asked questions from the previous class survey

SSD Architecture Considerations for a Spectrum of Enterprise Applications. Alan Fitzgerald, VP and CTO SMART Modular Technologies

VX1800 Series Unified Network Storage

Insight: that s for NSA Decision making: that s for Google, Facebook. so they find the best way to push out adds and products

The INFN Tier1. 1. INFN-CNAF, Italy

Vendor must indicate at what level its proposed solution will meet the College s requirements as delineated in the referenced sections of the RFP:

Ivane Javakhishvili Tbilisi State University High Energy Physics Institute HEPI TSU

Independent Electricity System Operator Rapid Migration of Big Data - Oracle DB using IBM Enterprise Storage Tools

Taurus Super-S Combo

SXL-4205Q LTO-8 Digital Archive

Exactly as much as you need.

IBM Storwize V5000 disk system

PASIG Disk Trends. Oracle Storage Technology 101 Session. Philippe Deverchère EMEA Storage CTO. September 16, 2014

USB 3.0 esata Hard Drive Docking Station with Cooling Fan

Low-cost BYO Mass Storage Project

Storage Resource Sharing with CASTOR.

Cisco UCS S3260 System Storage Management

Pharmacy college.. Assist.Prof. Dr. Abdullah A. Abdullah

Management Information Systems OUTLINE OBJECTIVES. Information Systems: Computer Hardware. Dr. Shankar Sundaresan

Taurus Mini Super-S3. Dual-Bay RAID Storage Enclosure for two 2.5-inch Serial ATA Hard Drives. User Manual March 31, 2014 v1.1

All-Flash Storage Solution for SAP HANA:

Chapter 12: Mass-Storage

Chapter 12: Mass-Storage

Data Sheet FUJITSU Storage ETERNUS CS8000 Datacenter solution for backup automation

dcache tape pool performance Niklas Edmundsson HPC2N, Umeå University

Data Protection for Cisco HyperFlex with Veeam Availability Suite. Solution Overview Cisco Public

Solution Brief: XenData Digital Video Archives in a Dalet Environment

CS370 Operating Systems

Transcription:

Study of the viability of a Green Storage for the ALICE-T1 Eduardo Murrieta Técnico Académico: ICN - UNAM

Objective To perform a technical analysis of the viability to replace a Tape Library for a Disk Only based solution

Motivation CERN uses a hierarchic scheme for the management of its data. Classified by different levels of Tiers, each one with different technical requirements and services

Tier-0 CERN Data Center Collects all data generated by detectors Do a first reconstruction of the data Distribute the raw and reconstructed data to Tier-1 It has the custodial of all the information, past, present and future; generated by the LHC and other experiments. Around 100 PB of capacity installed

Tier-1 Primary responsible for safe keeping a copy of the CERN raw and reconstructed data. Distributes data to Tier-2 centers and keep safe a copy of the data generated by the Tier-2. Anual availability > 98 %, fast response time on failures. Must have a 10 Gb connectivity to Tier-0. A combination of mass storage based on hard drives and tapes is a requirement for an optimal performance and long term custodial of the Tier-0 data. 13 sites around the world: only 2 in America in USA.

Tier-2 Provides sufficient computing power to do tasks of production and reconstruction. Provides sufficient storage for temporal and long term backup of data. Annual average availability > 95% UNAM has one Tier-2 data center for the ALICE experiment. 456 TB of storage in disks 1024 cores Te objective is to scale this Tier-2 data center to a Tier-1 10 Gb connectivity to CERN it is now possible from ICN The minimal requirements are 2 PB of raw capacity in tapes scalable up to 10 PB for RUN-3

Tape Library A tape library is a robotic system that automates the management of tape cartridges; from hundred to thousands of tapes. The most distinctive attribute of a tape is its low energy requirements as a backup system.

Storage media MECHANICAL GREEN DRIVES HDD SOLID STATE DRIVES SSD MAGNETIC TAPES CAPACITY Upto 8TB 1.6TB (->16TB) 8.5 TB (40 TB) POWER 8W 5.2 W 0 W (20 W) PERFORMANCE 190 MB/s 460 MB/s 252 MB/s RELATION $/TB ~USD $30 / TB ~USD $300 / TB ~USD $8 / TB GUARANTEE 3 years 5 years 30 years

Power consumption of the media Tapes The media without access do not require any energy The drive (reader) required to access the media goes from 12 to 20 watts. Hard drives Energy requirements varies according to the operation mode of the drive Active mode: ~9 Watts Standby mode: ~7 Watts Sleep mode: 0.5 Watts New technology of Green Drives can save energy by commuting between operation modes This process reduces the life time of the drive.

Testbed for Green Drives 4 WD Intellipower hard drives 3 TB each 5200 RPM 8s idle before switching to Standby power mode 1 WD formatted with ext4 1 WD formatted with xfs Both mounted No access to the drives 2 WD as system's drives with ext4 In normal use in a compute node A Ganglia's python module was made to check the smart counters

Green disk duty-cycle Maximum Load/Unload (active-sleep mode) cycles: 300,000 Default parking time after 8s of inactivity Guarantee 2 years => 2 years of work at 24x7 2 y * 365 d/y * 24 h/d * 60 m/h * 60 s / 300,000 = 210 s/cycle ~ 1 parking cycle every 3.5 minutes In order to use a Green Drive as a long term backup media, the filesystem synchronization must be very infrequent when there is no read/write operations.

Life time of the media Drive died after 151 days SMART counters: Power On hours : 3644 Load_Cycle_Count: 333112 This doesn t means that drives are defective, but that they are used incorrectly. In order to use a Green Drive as a backup media the access pattern must be the correct.

Median life time of a hard drive Backblaze Study based on 25,000 drives for 4 years Desktop hard drives with 3 years guarantee https://www.backblaze.com/blog/how-long-do-disk-drives-last

Power demand Appart of the media requirements of energy, each solution has associated an infrastructure that requires energy Temperature, humidity and dust control Tapes are more sensible to this factors than disks UPS Backup energy Platform dependent energy Motherboards, Enclosures, Tape readers, etc.

Scalability One important factor when selecting a backup system is its possibility to increase its capacity as the need for space increase in time. For a tape library its scalability possibilities are defined at the beginning. So it is important to define which is the maximum expected space required in the future. How do this impact in the cost at the first purchase? For a disk based solution, the limit is imposed mainly by the middleware that support the filesystem. EOS at the moment can manage hundred of Petabytes. For an ALICE-T1 a tape storage of 2 PB are required and 10 PB will be needed for RUN-3

EOS as a custodial system Open source distributed disk storage system Use commodity hardware Scalability: hundred of Petabytes Redundancy by replicas and RAID-like levels (reduce effective total capacity) General purpose file system. Combined with SMR drives and power-saving capabilities reduce the total cost per terabyte.

Cost estimation 1 PB EOS Oracle with StorageTek IBM HP Quantum SPECTRA Green SL3000 TS3500 ESL G3 I6000 T950 Drives Tape Drives $67,000 +media $134,000 $162,354 $268,477 $301,563 $219,344 $229,470 +library Tape library costs obtained from an Oracle study of 2013. EOS solution based on actual quotes.

Conclusions A solution based on Green Hard Drives seems to be a good replacement for a Tape Library System if: The access pattern to the drives preserve them with small periods of high activity mixed with long periods of low activity The access pattern also influence on the power consumption on the solution Tape Libraries are cost effective if the amount of storage is in the order of 10 PB according to CERN experts.

Next steps ICN will soon have a 10 Gb fiber connection to USA This accomplish the network requirement for a T1 In the next months ICN will install a small EOS solution based on Green Drives to test the behavior in terms of energy and pattern access of a backup system for an ALICE s T1 Quotes for Tape Libraries for 2PB scalable to 10PB will be requested to providers in order to have a real comparison between solutions. If Green Storage is cost effective, a proposal to the Alice data group will be done in order to replace the Tape Library for a Disk based solution and install a T1 at UNAM.

Report of the production of the ALICE s UNAM-T2 Luciano Díaz Eduardo Murrieta ICN - UNAM Arión Pérez DGTIC - UNAM

Computing Resources ~78 % average use

CPU time contributed last year UNAM contributed: 6.243 M CPU hours UNAM-T1 DGTIC UNAM ICN

Storage resources UNAM-T2 Wrong units must be MB/s

Storage resources UNAM-T2