High-Energy Physics Data-Storage Challenges


High-Energy Physics Data-Storage Challenges
Richard P. Mount, SLAC
SC2003

Experimental HENP

Understanding the quantum world requires:
- Repeated measurement: billions of collisions
- Large (500-2000 physicist) international collaborations
- 5-10 years of detector construction
- 10-20 years of data-taking and analysis
- Many experiments: ALICE, ATLAS, BaBar, Belle, CDF, CLEO, CMS, D0, LHCb, PHENIX, STAR

BaBar at SLAC:
- Measuring matter-antimatter asymmetry (why do we exist?)
- 500 physicists
- Data taking since 1999
- More data (~1 petabyte) than any other experiment

Fermilab Run II (CDF, D0) and RHIC at BNL (STAR, PHENIX): petabytes soon.

CERN LHC (ATLAS, CMS, ALICE): tens to hundreds of petabytes early in the next decade.

BaBar Experiment at SLAC

High Data-Rate Devices: the BaBar SVT (Silicon Vertex Tracker)

BaBar Collaboration: 500 physicists, 76 universities/labs, 9 countries

BaBar

At SLAC, for BaBar data analysis:
- 850-terabyte true database; over 1 petabyte of data in total
- 300 terabytes of disk storage
- >1 teraflop (>2 teraops) of data-intensive compute power
- 300 Mbits/s of sustained WAN traffic

In Europe, for BaBar data analysis (CCIN2P3, Padova, RAL):
- Mirror of much of the SLAC database (in tape robots)
- 100 terabytes of disk
- 1 teraop of data-intensive compute power
- 100s of Mbits/s of dedicated transatlantic bandwidth

Europe + North America:
- >1 teraop of simulation (to be doubled in the next 12 months)

All growing with Moore's Law.

Are the BaBarians Happy?
- Typical database queries take days, weeks or months.
- A factor of 10, 100 or 1000 performance improvement would revolutionize the science.
- Hardware alone cannot achieve these factors: if a 1000-box system delivers an answer in 6 hours, it does not follow that a 60,000-box system will deliver the result in 6 minutes.
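One way to make the last point concrete (an illustration, not an argument from the slide) is Amdahl's law: if only a fraction p of a query parallelizes perfectly while the rest stays serial (catalog lookups, tape mounts, merging), the speedup from adding boxes saturates at 1/(1-p). The numbers below are hypothetical.

```python
# Illustration (not from the slide): Amdahl's law shows why adding boxes
# alone cannot deliver arbitrary speedups. The parallel fraction is assumed.

def speedup(parallel_fraction: float, boxes: int) -> float:
    """Amdahl's law: speedup when the parallel fraction scales perfectly
    across `boxes` and the remainder stays serial."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / boxes)

# Suppose 99% of a 6-hour query parallelizes, 1% is serial.
for n in (1_000, 60_000):
    hours = 6.0 * speedup(0.99, 1_000) / speedup(0.99, n)
    print(f"{n:>6} boxes -> ~{hours:.1f} hours")   # 60,000 boxes: still ~5.5 h
```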

SLAC Storage Architecture (layered, from clients down to tape):
- Clients: 1200 dual-CPU Linux boxes, 900 single-CPU Sun/Solaris boxes
- IP network (Cisco)
- Disk servers: 120 dual/quad-CPU Sun/Solaris with 300 TB of Sun FibreChannel RAID arrays, running the Objectivity/DB object database plus HEP-specific ROOT software
- IP network (Cisco)
- Tape servers: 25 dual-CPU Sun/Solaris, running HPSS plus SLAC enhancements to the Objectivity and ROOT server code
- Tape: 40 STK 9940B drives, 6 STK 9840A drives, 6 STK Powderhorn silos; over 1 PB of data

Generic Storage Architecture (diagram): many clients in front of a disk-server tier, backed by tape.

Large Hadron Collider

CMS Experiment: Find the Higgs

Data Management Challenges (1)

Sparse access to objects in petabyte databases:
- Natural object size is 1-10 kbytes
- Disk (and tape) performance is dominated by latency

Approaches (a sketch of the request-reordering idea follows this list):
- Hash data over physical disks
- Instantiate richer database subsets for each analysis application
- Queue and reorder all disk access requests
- Keep the hottest objects in (many terabytes of) memory
- etc.
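The following is a minimal sketch of the "queue and reorder all disk access requests" approach, not SLAC's actual server code: pending small-object reads are batched and served in offset order so the disk head sweeps once instead of seeking randomly. The offsets, sizes and the read_at() helper are hypothetical.

```python
# Minimal sketch (not SLAC's code) of queueing and reordering disk reads.

from dataclasses import dataclass

@dataclass
class ReadRequest:
    object_id: str
    offset: int      # byte offset of the object on the physical disk
    size: int        # object size in bytes (typically 1-10 kB)

def reorder_elevator(pending: list[ReadRequest]) -> list[ReadRequest]:
    """Serve queued requests in ascending offset order (one sweep of the
    head) instead of arrival order (one seek per request)."""
    return sorted(pending, key=lambda r: r.offset)

def serve_batch(pending, read_at):
    """read_at(offset, size) stands in for the real low-level read call."""
    return {r.object_id: read_at(r.offset, r.size)
            for r in reorder_elevator(pending)}
```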

Data Management Challenges (2)

Information management:
- BaBar has cataloged 60,000,000 collections (database views)
- Freedom to create any subset dataset or derived dataset is wonderful
- In a 1000-scientist collaboration, the default result is chaos

Approaches (a sketch of a catalog record follows this list):
- Limit freedom by allocating very little space for datasets that are not designed by a committee
- Catalog all the subset and derived datasets
- Catalog the way datasets were made
- Virtual data
- etc.
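Below is a minimal sketch, not BaBar's actual schema, of what "catalog the way datasets were made" implies: every derived collection records its inputs and the transformation that produced it, so a dataset can be traced back, or re-derived ("virtual data"), instead of silently accumulating. All field and class names are hypothetical.

```python
# Minimal sketch (not BaBar's schema) of a provenance catalog for collections.

from dataclasses import dataclass

@dataclass
class CollectionRecord:
    name: str            # e.g. "skim/BtoJpsiKs/2003-v3" (hypothetical)
    parents: list[str]   # collections this one was derived from
    transformation: str  # selection/code used, pinned to a version
    owner: str
    size_bytes: int = 0

class ProvenanceCatalog:
    def __init__(self):
        self._records: dict[str, CollectionRecord] = {}

    def register(self, rec: CollectionRecord) -> None:
        self._records[rec.name] = rec

    def lineage(self, name: str) -> list[str]:
        """Follow first-parent links back toward the raw data."""
        chain, rec = [], self._records.get(name)
        while rec is not None:
            chain.append(rec.name)
            rec = self._records.get(rec.parents[0]) if rec.parents else None
        return chain
```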

Storage Characteristics: Capacity, Latency, Speed, Cost

Latency and Speed - Random Access

[Chart: Random-Access Storage Performance. Retrieval rate (Mbytes/s, log scale from below 1e-9 to 1000) versus log10(object size in bytes), for PC2100 memory, a WD 200 GB disk, and an STK 9940B tape drive.]

Latency and Speed - Random Access

[Chart: Historical Trends in Storage Performance. Retrieval rate (MBytes/s, log scale) versus log10(object size in bytes) for PC2100 memory, a WD 200 GB disk, and an STK 9940B tape drive, compared with RAM, disk, and tape of 10 years ago.]
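The shape of these curves follows from a simple model (an illustration with assumed round-number device parameters, not numbers taken from the charts): for an object of size s, the effective retrieval rate is s / (access latency + s / streaming rate), so small objects are latency-bound and only megabyte-scale and larger objects approach streaming speed.

```python
# Illustrative model of the curves above (assumed round-number parameters,
# not measurements from the talk).

DEVICES = {                # (access latency [s], streaming rate [bytes/s])
    "PC2100 memory":  (1e-7, 2.1e9),
    "commodity disk": (8e-3, 5e7),
    "STK 9940B tape": (60.0, 3e7),    # includes mount + position time
}

def retrieval_rate(size_bytes: float, latency_s: float, stream_bps: float) -> float:
    """Effective bytes/s when fetching one object of the given size."""
    return size_bytes / (latency_s + size_bytes / stream_bps)

for name, (lat, bw) in DEVICES.items():
    for size in (1e3, 1e6, 1e9):      # 1 kB, 1 MB, 1 GB objects
        mb_s = retrieval_rate(size, lat, bw) / 1e6
        print(f"{name:>14}, {size:>8.0e} B object: ~{mb_s:.3g} MB/s")
```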

Storage Characteristics - Cost

Storage hosted on the network (costs in $k; per-TB figures are net after RAID, hot spares, etc.; the random-access column is for objects of the size shown in the last column):

                                     Cost per TB   Cost per GB/s   Cost per GB/s   Typically accessed
                                           ($k)      (streaming)   (random access)    object size
  Good memory *                             750               1              18          4 bytes
  Cheap memory                              250             0.4               6          4 bytes
  Enterprise SAN, maxed out                  40             400           8,000          5 kbytes
  High-quality FibreChannel disk *           10             100           2,000          5 kbytes
  Tolerable IDE disk                          5              50           1,000          5 kbytes
  Robotic tape (STK 9840C)                    1           2,000          25,000        500 Mbytes
  Robotic tape (STK 9940B) *                0.4           2,000          50,000        500 Mbytes

  * Current SLAC choice

Storage-Cost Notes
- Memory costs per TB: cost of memory + host system.
- Memory costs per GB/s: (cost of typical memory + host system) / (GB/s of memory in this system).
- Disk costs per TB: cost of disk + server system.
- Disk costs per GB/s: (cost of typical disk + server system) / (GB/s of this system).
- Tape costs per TB: cost of media only.
- Tape costs per GB/s: (cost of typical server + drives + robotics only) / (GB/s of this server + drives + robotics).
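As a worked illustration of these definitions (the component prices and delivered rates below are hypothetical placeholders, not the figures behind the table):

```python
# Worked illustration of the cost definitions above. All prices and rates
# are hypothetical placeholders, not the numbers behind the table.

def cost_per_tb(component_cost, host_cost, capacity_tb):
    """Cost per TB: (storage component + hosting system) / usable capacity."""
    return (component_cost + host_cost) / capacity_tb

def cost_per_gbs(component_cost, host_cost, delivered_gb_per_s):
    """Cost per GB/s: (typical component + hosting system) / delivered GB/s."""
    return (component_cost + host_cost) / delivered_gb_per_s

# e.g. a hypothetical disk server: $4k of disks + $6k server,
# 2 TB usable after RAID, 0.2 GB/s delivered for random small-object reads
print(cost_per_tb(4_000, 6_000, 2.0))    # -> 5000.0  ($ per TB)
print(cost_per_gbs(4_000, 6_000, 0.2))   # -> 50000.0 ($ per GB/s)
```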

Storage Issues - Tapes
- Still cheaper than disk for low I/O rates.
- Disk becomes cheaper at, for example, 300 MB/s per petabyte of random-accessed 500 MB files.
- Will SLAC ever buy new tape silos?
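A rough way to see where such a crossover comes from (the drive parameters below are assumed round numbers, not figures from the talk): each random access to a 500 MB file on tape pays a mount-and-position penalty, so sustaining a given MB/s per petabyte requires a certain number of drives, and at some I/O rate the drive-plus-robotics cost rivals just keeping the petabyte on disk.

```python
# Rough illustration of the tape-vs-disk crossover argument (all drive
# figures are assumed round numbers, not values from the talk).

FILE_MB    = 500.0   # random-accessed file size from the slide
ACCESS_S   = 90.0    # assumed: mount + locate time per random access
STREAM_MBS = 30.0    # assumed: native streaming rate of one tape drive

per_drive = FILE_MB / (ACCESS_S + FILE_MB / STREAM_MBS)   # MB/s per drive
drives_needed = 300.0 / per_drive                          # for 300 MB/s per PB

print(f"~{per_drive:.1f} MB/s per drive for random {FILE_MB:.0f} MB files")
print(f"~{drives_needed:.0f} drives per petabyte to sustain 300 MB/s")
# At that drive count, drive + robot cost per petabyte approaches the cost
# of simply holding the petabyte on disk.
```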

Storage Issues - Disks
- Random access performance is lousy, independent of cost, unless objects are megabytes or more.
- The Google people say: "If you were as smart as us, you could have fun building reliable storage out of cheap junk."
- My systems group says: "Accounting for TCO, we are buying the right stuff."

Storage Issues - Software

Transparent, scalable access to data:
- Still waiting for a general-purpose, scalable cluster file system (Lustre?)
- More application-specific solutions (e.g. Objectivity, ROOT/xrootd) work well.

Information management:
- BaBar physicists have created millions of data products.
- Automated tracking of data provenance and maximized reuse of data products is becoming a requirement.

Workshops on Data Management (sponsored by DOE/MICS)
- March 16-18, 2004, SLAC: focus on application science needs and technology.
- April 20-33, 2004, East Coast or Midwest: focus on computing science and long-term planning for the Office of Science.