CERN Lustre Evaluation

CERN Lustre Evaluation
Arne Wiebalck
Sun HPC Workshop, Open Storage Track
Regensburg, Germany, 8th Sep 2009
www.cern.ch/it

Agenda
A Quick Guide to CERN
Storage Use Cases
Methodology & Initial Findings
Current Thoughts & Wish List
Conclusion

A Quick Guide to CERN
CERN: European Organization for Nuclear Research
Located at the Swiss/French border near Geneva
Twenty Member States: Austria, Belgium, Bulgaria, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Italy, Netherlands, Norway, Poland, Portugal, Slovak Republic, Spain, Sweden, Switzerland, UK
Plus eight Observer States: European Commission, India, Israel, Japan, Russian Federation, Turkey, UNESCO and USA
Budget (2008): 1154 MCHF (~715 MEUR)
Personnel: 2600 staff, 700 fellows/associates

Fundamental Laws of Nature
Why do particles have mass? Newton could not explain it, and neither can we
What is 96% of the Universe made of? We only know 4% of it!
Why is there no antimatter left in the Universe? Nature should be symmetrical
What was matter like during the first second of the universe's life, right after the "Big Bang"? A journey towards the beginning of time

LHC
(Timeline graphic: 10^-34 s, 10^-10 s, 1 s, 3 min, 300 000 years, 1 billion years, 15 billion years after the Big Bang.)

CERN's Tools
The world's most powerful accelerator: the LHC
A 27 km long tunnel filled with high-tech instruments
Equipped with thousands of superconducting magnets
Accelerates particles to energies never before obtained
Produces particle collisions creating microscopic big bangs
Very large, sophisticated detectors
Four experiments, each the size of a cathedral
A hundred million measurement channels each
Data acquisition systems treating Petabytes per second
Top-level computing to distribute and analyse the data
A Computing Grid linking ~200 computer centres around the globe
Sufficient computing power and storage to handle 15 Petabytes per year, making them available to thousands of physicists for analysis

Proton Acceleration & Collision
Protons are accelerated by several machines up to their final energy (7+7 TeV)
Head-on collisions are produced right in the centre of a detector, which records the new particles being produced
Such collisions take place 40 million times per second, day and night, for about 100 days per year

The Large Hadron Collider (LHC)

The ATLAS Experiment

The LHC Computing Grid

Storage Use Cases
HSM System: CERN Advanced STORage Manager (CASTOR); 18.3 PB, 120 million files, 1 200 servers
Analysis Space: analysis of the experiments' data; XRootD plus CASTOR
Project Space: >150 projects; experiments' code (build infrastructure); CVS/SVN, Indico, Twiki
User home directories: 20 000 users on AFS; 50 000 volumes, 25 TB, 1.5 billion accesses/day, 50 servers, 400 million files

Why a Lustre Evaluation?
Lustre: performance, scalability, HEPiX FSWG
CERN: home directories & projects on AFS, CASTOR as HSM, XRootD plus CASTOR for analysis
Lustre: an opportunity for consolidation?
Does Lustre meet the requirements? Can Lustre be operated?

Project Plan
Assemble a list of points to look at: understand the requirements of the use cases; manageability
Gather information: Lustre training; attend LUG; exchange experiences with other sites
Get hands-on experience: set up test instances and a pre-production instance; familiarize with certain functionality; integrate with CERN's tools for fabric management
Document the findings

Use Case Requirements
HSM: generic interface (CASTOR/TSM); scalable; support for random access
Analysis Space: low-latency access (disk-based); O(1000) open/sec; several GB/s aggregate bandwidth; ~10% backup; mountable plus XRootD access; ACLs & quota
Home Directories / Project Space: strong authentication; backup; wide-area access; small files; availability; ACLs & quota

Areas of Interest
Strong authentication
Backup
Fault-tolerance
Special performance & optimization
HSM interface
Life Cycle Management (LCM) & tools

Strong Authentication
v1.8: no Kerberos
v2.0: early adaptation; v2.x: full Kerberos (late 2010 / early 2011), robust and usable
Implementation not (yet) complete
PSC early adopter, WAN setup

Backup
Metadata (MDT): LVM snapshots plus tar of the EAs; orphan files, ghost files
User data (OST): crawling the namespace; with v1.8.x: e2scan?; with v2.0.x: changelogs
Used in the pre-production instance (backup to TSM)
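
To make the MDT part of this concrete, here is a minimal sketch of the snapshot-plus-tar idea, assuming the MDT lives on an LVM volume that can be mounted as ldiskfs; all volume, mount and backup paths are hypothetical and error handling is kept to a minimum:

```python
#!/usr/bin/env python
"""Sketch of an MDT backup: LVM snapshot, then tar plus the EAs.

Takes an LVM snapshot of the MDT volume, mounts it read-only as
ldiskfs, dumps the extended attributes (which carry the striping
information) with getfattr, and archives the namespace with tar.
Volume, device and path names are assumptions.
"""
import subprocess

MDT_LV    = "/dev/vg_mds/mdt"        # assumed LVM volume holding the MDT
SNAP_NAME = "mdt-backup-snap"
SNAP_DEV  = "/dev/vg_mds/" + SNAP_NAME
MNT       = "/mnt/mdt-snap"
BACKUP    = "/backup/mdt-backup.tgz" # would be handed to TSM in the setup above

def run(cmd, **kw):
    print("+", " ".join(cmd))
    subprocess.check_call(cmd, **kw)

run(["lvcreate", "--snapshot", "--size", "1G", "--name", SNAP_NAME, MDT_LV])
run(["mkdir", "-p", MNT])
run(["mount", "-t", "ldiskfs", "-o", "ro", SNAP_DEV, MNT])
try:
    # Save the extended attributes of every file in the MDT namespace.
    with open("/backup/mdt-ea.bak", "w") as ea:
        run(["getfattr", "-R", "-d", "-m", ".*", "-e", "hex", "-P", "."],
            cwd=MNT, stdout=ea)
    # Archive the MDT namespace itself.
    run(["tar", "czf", BACKUP, "-C", MNT, "."])
finally:
    run(["umount", MNT])
    run(["lvremove", "-f", SNAP_DEV])
```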

Fault-tolerance
Lustre comes with support for failover: active/active for OSSs, active/passive for the MDS
Used at other sites
Technology: shared storage (FC, DRBD, iSCSI), Heartbeat
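
As a small illustration of the client side of MDS failover, a sketch of mounting a Lustre client with a primary and a standby NID, so the client retries the partner node if the active MDS is unreachable; host names and the file system name are assumptions:

```python
#!/usr/bin/env python
"""Sketch: mount a Lustre client against an active/passive MDS pair.

The client is given both NIDs; if the primary is unreachable it
falls back to the standby. Host names and fsname are assumptions.
"""
import subprocess

PRIMARY_MDS = "mds01@tcp"   # assumed NID of the active MDS/MGS
STANDBY_MDS = "mds02@tcp"   # assumed NID of the failover partner
FSNAME      = "test"
MOUNT_POINT = "/mnt/lustre"

device = f"{PRIMARY_MDS}:{STANDBY_MDS}:/{FSNAME}"
subprocess.check_call(["mount", "-t", "lustre", device, MOUNT_POINT])
```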

MDS Test Setup
(Diagram: MDS with MDT on shared storage, OSSs, clients.)
Dell PowerEdge M600 blade servers (16 GB RAM), private iSCSI network, Dell EqualLogic iSCSI arrays (16x 500 GB SATA)
Fully redundant against component failure
iSCSI for shared storage; Linux device mapper + md for mirroring
Performance?
OSS: disk servers (DAS/iSCSI targets), iSCSI arrays
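
A minimal sketch of the "md for mirroring" idea: build a RAID1 device over two iSCSI LUNs (ideally one per array, so the MDT survives the loss of either array) and format it as the MDT. Device names and the file system name are assumptions.

```python
#!/usr/bin/env python
"""Sketch: mirror the MDT across two iSCSI LUNs with Linux md (RAID1)."""
import subprocess

ISCSI_LUN_A = "/dev/sdb"   # LUN exported by the first iSCSI array (assumption)
ISCSI_LUN_B = "/dev/sdc"   # LUN exported by the second iSCSI array (assumption)
MD_DEVICE   = "/dev/md0"   # mirrored block device the MDT will live on

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.check_call(cmd)

# Build a RAID1 mirror over the two iSCSI LUNs.
run(["mdadm", "--create", MD_DEVICE, "--level=1", "--raid-devices=2",
     ISCSI_LUN_A, ISCSI_LUN_B])

# Format the mirror as a combined MGS/MDT (fsname is an assumption).
run(["mkfs.lustre", "--mdt", "--mgs", "--fsname=test", MD_DEVICE])
```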

Special Performance
Small files: no client-side caching, 1 MB transfer unit, double lookup
How bad is it? Recent improvements (OSS read cache)?
Understand the tuning options
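
To quantify "how bad is it", a simple small-file micro-benchmark along these lines could be run on a Lustre client mount; the mount point, file count and file size are assumptions:

```python
#!/usr/bin/env python
"""Sketch of a small-file micro-benchmark on a mounted Lustre client.

Creates N small files, then reads them back, and reports the create
and read rates in files per second.
"""
import os
import time

MOUNT_POINT = "/mnt/lustre/smallfile-test"   # assumed Lustre client mount
NUM_FILES = 10000
PAYLOAD = b"x" * 4096                        # 4 KB "small" files

os.makedirs(MOUNT_POINT, exist_ok=True)

# Create phase: one open/write/close per file.
start = time.time()
for i in range(NUM_FILES):
    with open(os.path.join(MOUNT_POINT, f"f{i:06d}"), "wb") as f:
        f.write(PAYLOAD)
create_rate = NUM_FILES / (time.time() - start)

# Read phase: one open/read/close per file.
start = time.time()
for i in range(NUM_FILES):
    with open(os.path.join(MOUNT_POINT, f"f{i:06d}"), "rb") as f:
        f.read()
read_rate = NUM_FILES / (time.time() - start)

print(f"create: {create_rate:.0f} files/s, read: {read_rate:.0f} files/s")
```

Comparing such numbers with and without the OSS read cache, or against AFS, would give a first idea of the tuning headroom.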

HSM
Interface design by CEA (J.-C. Lafoucrière, A. Degrémont)
Implementations: HPSS, Sun SAM-QFS, Enstore (FNAL)?
HSM mailing list quite active
Beta version available later this year

Life Cycle Management
Adding capacity (add an OSS): possible, but quiesce clients (?!); allocation policy, coordinated filling
Removing capacity (remove an OSS): no user-transparent data migration
MDT (re-)sizing
Managing quota
Develop procedures / tools
Lustre upgrades, system upgrades (kernel)
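
One example of such a procedure: a small sketch that parses `lfs df` on a client to watch per-OST fill levels while a newly added OSS is being filled or an OST is being drained; the mount point is an assumption:

```python
#!/usr/bin/env python
"""Sketch: watch per-OST fill levels after adding or draining an OST.

Parses the output of `lfs df` on a Lustre client and prints the
usage of each OST.
"""
import subprocess

MOUNT_POINT = "/mnt/lustre"   # assumed Lustre client mount

out = subprocess.check_output(["lfs", "df", MOUNT_POINT], text=True)

for line in out.splitlines():
    fields = line.split()
    # OST lines look like:
    #   <fs>-OST0000_UUID  <1K-blocks>  <used>  <available>  <use%>  <mount>[OST:0]
    if len(fields) >= 5 and "OST" in fields[0]:
        name, total, used = fields[0], int(fields[1]), int(fields[2])
        if total:
            print(f"{name}: {100.0 * used / total:5.1f}% used")
```

Run periodically, this gives a rough view of whether the allocator is actually spreading new objects onto the new OSTs.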

Wish List
Complete the support for strong authentication
More control over the system: client-server coupling (recovery); too powerful users (striping, pools)
Stronger support for Life Cycle Management: user-transparent data migration
File replication: availability, different striping policies; easier maintenance (high level vs. redundant storage); usable for migration?
Outsourcing of privileges

Current Thoughts
New versions bring some helpful features: v1.8: VBR, adaptive timeouts, OSS read cache; v2.0: Kerberos, changelogs, clustered MDS
Increased emphasis on quality assurance: milestone-based pre-releases, the moving-targets problem
Responsiveness of the Lustre team
Interest in non-performance features by other sites

Summary
CERN is looking into Lustre as a candidate for storage consolidation
Investigation of manageability is missing
Input is welcome, share your experiences!
arne.wiebalck@cern.ch

Some Links
CERN: http://www.cern.ch
CASTOR: http://www.cern.ch/castor
XRootD: http://xrootd.slac.stanford.edu/
AFS: http://www.openafs.org