
1 Storage on the Lunatic Fringe
Thomas M. Ruwart
University of Minnesota Digital Technology Center
Intelligent Storage Consortium

2 Orientation
- Who are the lunatics?
- What are their requirements?
- Why is this interesting to the Storage Industry?
- What is SNIA doing about this?
- Conclusions

3 Who are the Lunatics?
- DoE Accelerated Strategic Computing Initiative (ASCI): BIG data, locally and widely distributed, high bandwidth access, relatively few users, secure, short-term retention
- High Energy Physics (HEP) at Fermilab, CERN, DESY: BIG data, locally distributed, widely available, moderate number of users, sparse access, long-term retention
- NASA Earth Observing System Data Information Systems (EOSDIS): moderately sized data, locally distributed, widely available, large number of users, very long-term retention
- DoD NSA: lots of little data (trillions of files), locally distributed, relatively few users, secure, long-term retention
- DoD Army High Performance Computing Centers and Naval Research Center: BIG data, locally and widely distributed, relatively few users, high bandwidth access, secure, very long-term reliable retention

4 A bit of History
- 1990: Supercomputer Centers operating with HUGE disk farms of GB
- 1990: Laptop computers have 50MB internal disk drives
- 1992: Fast/wide SCSI runs at break-neck speeds of 20 MB/sec
- 1994: Built a 1+TB array of disks with a single SGI XFS file system and wrote a single 1TB file
  - Used 4GB disks in 7+1 RAID 5 disk arrays
  - 36 disk arrays mounted in 5 racks
- 1997: ASCI Mountain Blue, 75TB distributed
- 2002: ASCI Q, 700TB online, high performance, pushing the limits of traditional [legacy] block-based file systems

5 The not-too-distant Future
- 2004: ASCI Red Storm, 240TB online, high bandwidth, massively parallel
- 2005: ASCI Purple, 3000TB online, high performance, OSD/Lustre
- 2006: NASA RDS, 6000TB online, global access, CAS, OSD, Data Grids, Lustre?
- 2007: DoE Fermi Lab / CERN, 3 PB/year online/nearline, global sparse access
- 2010: Your laptop will have a 1TB internal disk that will still be barely adequate for MS Office

6 DoE ASCI
- 1998 Mountain Blue (Los Alamos)
  - SGI Origin 2000 systems
  - 75TB disk storage
- 2002 Q
  - processor machines, processor I/O nodes
  - GB FC connections to 64 I/O nodes
  - GB FC connections to disk storage subsystem
  - 692 TB disk storage, 20GB/sec bandwidth
  - 2 file systems of 346TB each
  - 4 file system layers between the application and the disk media
- 2004 Red Storm
  - 10,000 processors, 10TB main memory
  - 240TB disk, 50 GB/sec bandwidth
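
Many of the exact node and channel counts were lost in transcription, but the capacity and bandwidth figures survive, and the useful comparison is between them. As a rough illustration (not taken from the slides), the sketch below shows how long a full sweep of each store takes at its rated rate, assuming decimal units:

```python
# Capacity vs. bandwidth for the ASCI systems listed above: even at these
# aggregate rates, touching every byte in the store takes hours, which is
# part of why these sites keep pushing the fringe. Decimal units assumed.

systems = {
    "ASCI Q (2002)":         {"capacity_tb": 692, "bandwidth_gb_s": 20},
    "ASCI Red Storm (2004)": {"capacity_tb": 240, "bandwidth_gb_s": 50},
}

for name, s in systems.items():
    hours = s["capacity_tb"] * 1000 / s["bandwidth_gb_s"] / 3600
    print(f"{name}: {s['capacity_tb']} TB at {s['bandwidth_gb_s']} GB/s, "
          f"about {hours:.1f} hours to sweep the full store once")
```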

7 DoE ASCI Purple Requirements
- Parallel I/O bandwidth: multiple (up to 60,000) clients access one file at hundreds of GB/sec
- Support for very large (multi-petabyte) file systems; single files of multi-terabyte size must be permitted
- Scalable file creation and metadata operations
  - Tens of millions of files in one directory
  - Thousands of file creates per second within the same directory
- Archive-driven performance: the file system should support high bandwidth data movement to tertiary storage
- Adaptive pre-fetching: sophisticated pre-fetch and write-behind schemes are encouraged, but a method to disable them must accompany them
- Flow control and quality of I/O service
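
Divided across the client count, the aggregate numbers turn into surprisingly modest per-client figures, and the directory-scale metadata numbers imply hours of sustained create traffic. A minimal back-of-envelope sketch, assuming 300 GB/sec for "hundreds of GB/sec" and 5,000 creates/sec for "thousands per second" (both are assumptions, not figures from the requirements):

```python
# Back-of-envelope sizing for the ASCI Purple requirements above.
# AGGREGATE_BW_GB_S and CREATES_PER_SEC are illustrative assumptions.

CLIENTS = 60_000
AGGREGATE_BW_GB_S = 300          # assumed "hundreds of GB/sec"
CREATES_PER_SEC = 5_000          # assumed "thousands per second"
FILES_PER_DIR = 20_000_000       # "tens of millions of files in one directory"

per_client_mb_s = AGGREGATE_BW_GB_S * 1024 / CLIENTS
hours_to_fill_dir = FILES_PER_DIR / CREATES_PER_SEC / 3600

print(f"Fair-share bandwidth per client: {per_client_mb_s:.1f} MB/s")
print(f"Time to create {FILES_PER_DIR:,} files in one directory: "
      f"{hours_to_fill_dir:.1f} hours")
```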

8 HEP Fermilab and CMS
- The Compact Muon Solenoid (CMS)
  - $750M experiment being built at CERN in Switzerland
  - Will be active in 2007
  - Data rate from the detectors is ~1 PB/sec
  - Data rate after filtering is ~hundreds of MB/sec
- The Data Problem
  - Dataset for a single experiment is ~1PB
  - Several experiments per year are run
  - Must be made available to 5000 scientists all over the planet (Earth, primarily)
  - Dense dataset, sparse data access by any one user
  - Access patterns are not deterministic
- HEP experiments cost $US 1B, last 20 years, involve thousands of collaborators at hundreds of institutions worldwide, and collect and analyze several petabytes of data per year
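
The gap between the detector rate and what can actually be stored is the heart of the problem. A rough calculation, assuming 200 MB/sec for "hundreds of MB/sec" after filtering and the usual HEP figure of roughly 10^7 live seconds per year (both assumptions, not numbers from the slide):

```python
# Rough scale of the CMS data problem described above.
# POST_FILTER_MB_S and LIVE_SECONDS_PER_YEAR are illustrative assumptions.

DETECTOR_RATE_PB_S = 1.0          # ~1 PB/sec off the detector
POST_FILTER_MB_S = 200            # assumed "hundreds of MB/sec"
LIVE_SECONDS_PER_YEAR = 1e7       # assumed accelerator live time per year

reduction = (DETECTOR_RATE_PB_S * 1e9) / POST_FILTER_MB_S   # 1 PB = 1e9 MB
stored_pb_per_year = POST_FILTER_MB_S * LIVE_SECONDS_PER_YEAR / 1e9

print(f"Online filtering discards all but 1 part in {reduction:,.0f}")
print(f"Data kept per year at {POST_FILTER_MB_S} MB/s: "
      f"~{stored_pb_per_year:.1f} PB")
```

At these assumed rates the stored volume lands in the low petabytes per year, consistent with the "several petabytes of data per year" on the slide.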

9 LHC Data Grid Hierarchy (CMS as example; Atlas is similar)
[Diagram, courtesy Harvey Newman, CalTech and CERN: the CMS detector (15m x 15m x 22m, 12,500 tons, $700M) feeds the online system at ~PByte/sec; Tier 0+1 at CERN performs event reconstruction and simulation and sends data at ~2.5 Gbits/sec to Tier 1 regional centers (French, German, Italian, and FermiLab USA centers); Tier 2 centers connect at ~Gbps; Tier 3 is the institutes' physics data caches (~0.25 TIPS each) used for analysis; Tier 4 is workstations.]
CERN/CMS data goes to 6-8 Tier 1 regional centers, and from each of these to 6-10 Tier 2 centers. Physicists work on analysis channels at 135 institutes. Each institute has ~10 physicists working on one or more channels. Physicists in 31 countries are involved in this 20-year experiment, in which DOE is a major player.

10 NASA EOSDIS Remote Data Store
- Project:
  - Build a 6PB data archive with a life expectancy of at least 20 years, probably more
  - Make data and data products available to 2 million users
- What to use?
  - Online versus nearline
  - SCSI vs ATA
  - Tape vs optical
  - How much of each, and when?
  - Data Grids?
- Dealing with technology life cycles: continual migration
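
"Continual migration" is not optional at this scale: every storage technology generation, the entire archive has to be copied forward onto new media before the old media and drives age out. A sketch of the sustained rate that implies, assuming a 5-year refresh cycle and that the data movers run 70% of the time (both assumptions, not project figures):

```python
# Why continual migration dominates a 6PB, 20+ year archive.
# REFRESH_YEARS and DUTY_CYCLE are illustrative assumptions.

ARCHIVE_PB = 6
REFRESH_YEARS = 5                 # assumed storage technology generation
DUTY_CYCLE = 0.7                  # assumed fraction of time spent migrating

seconds_available = REFRESH_YEARS * 365 * 24 * 3600 * DUTY_CYCLE
required_mb_s = ARCHIVE_PB * 1e9 / seconds_available

print(f"Sustained copy rate to roll {ARCHIVE_PB}PB forward every "
      f"{REFRESH_YEARS} years: ~{required_mb_s:.0f} MB/s, continuously, "
      f"for the life of the archive")
```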

11 DoD NSA
- How to deal with a trillion files?
  - At 256 bytes of metadata per file, that is 256TB just for the file system metadata for one trillion files
  - File system resiliency
  - Backups? Forget it.
- File creation rate is a challenge: 32,000 files per second, sustained for 1 year, will generate 1 trillion files
- How to search for any given file
- How to search for any given piece of information inside all the files
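
Both figures on this slide follow directly from the trillion-file target; a quick check of the arithmetic (decimal units assumed):

```python
# Checking the trillion-file numbers quoted above.

FILES = 1_000_000_000_000         # one trillion files
METADATA_BYTES_PER_FILE = 256
SECONDS_PER_YEAR = 365 * 24 * 3600

metadata_tb = FILES * METADATA_BYTES_PER_FILE / 1e12
creates_per_sec = FILES / SECONDS_PER_YEAR

print(f"Metadata alone: {metadata_tb:.0f} TB")                    # 256 TB
print(f"Create rate to reach a trillion files in one year: "
      f"{creates_per_sec:,.0f} files/sec")                        # ~32,000
```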

12 DoD MSRC
- 500TB per year data growth
- Longevity of data retention is critical: 100% reliable access to any piece of data for 20+ years
- Security is critical
- Reasonably quick access to any piece of data from anywhere at any time
- Heterogeneous computing and storage environment

13 History has shown
- The problems that the Lunatic Fringe is working on today are the problems that the mainstream storage industry will face in 5-10 years
- Legacy block-based file systems break at these scales
- Legacy network file system protocols cannot scale to meet these extreme requirements

14 Looking Forward

15 What happens when...
- NEC announces a 10Tbit memory chip
- Disk drives reach 1TByte and beyond
- MEMS devices become commercially viable
- Holographic storage devices become commercially viable
- Interface speeds reach 1Tbit/sec
- Intel develops the sub-space channel
Vendors need better ways to exploit the capabilities of these technologies rather than react to them.

16 Common thread
- Their data storage capacity, access, and retention requirements are continually increasing
- Some of the technologies and concepts the Lunatic Fringe is looking at include:
  - Object-based Storage Devices (OSD)
  - Intelligent Storage
  - Data Grids
  - Borg Assimilation Technologies, etc.

17 How does SNIA make a difference?
- Act as a point to achieve critical mass behind emerging technologies such as OSD, SMI, and Intelligent Storage
- Make sure that these emerging technologies come to market from the beginning as standards (not proprietary implementations that later migrate to standards)
- Help emerging technologies such as OSD and Intelligent Storage get beyond the potential barrier
- Help generate vendor and user awareness and education regarding future trends and emerging technologies

18 Conclusions
- Lunatic Fringe users will continue to push the limits of existing hardware and software technologies
- The Lunatic Fringe is a moving target: there will always be a Lunatic Fringe well beyond where you are
- The Storage Industry at large should pay more attention to:
  - What they are doing
  - Why they are doing it
  - What they learn

19 References
- University of Minnesota Digital Technology Center
- ASCI
- Fermilab
- NASA EOSDIS
- NSA

20 Contact Info
