Data Movement & Storage Using the Data Capacitor Filesystem

Data Movement & Storage Using the Data Capacitor Filesystem
Justin Miller
jupmille@indiana.edu
http://pti.iu.edu/dc
Big Data for Science Workshop, July 2010

Challenges for DISC
The keynote by Alex Szalay identified the challenges that researchers face:
Scientific data doubles every year
The amount of data is a barrier to extracting knowledge
The problem of today is data access
How can we minimize data movement?

Workflow Example: Single Compute
[Workflow diagram: data moves between a data source, a single compute resource, and the researcher's computer]

Workflow Example: Multiple Compute
[Workflow diagram: the researcher's computer moves data between a data source and two compute resources]

Workflow Example: Visualization
[Workflow diagram: a visualization resource is added alongside the data source, two compute resources, and the researcher's computer]

Workflow Example: Archive
[Workflow diagram: a tape archive is added alongside the data source, compute resources, visualization resource, and the researcher's computer]

Data Movement & Storage
This is an unsustainable workflow
It works for GB, maybe a single TB, but not more
Every resource is another series of transfers
Data movement is in the way of doing work
There are good reasons to add resources to the workflow, and we haven't addressed other drawbacks

IU Central Filesystem Workflow
[Workflow diagram: the Data Capacitor sits at the center, connecting the data source, compute resources, visualization resource, tape archive, and the researcher's computer]

IU's Data Capacitor Filesystem
Funded by the National Science Foundation in 2005
The funds purchased 535TB of Lustre storage, with 339TB available as a production service
The Data Capacitor name comes from electronics: a capacitor provides transient storage of electrons, absorbs and evens out peaks in flow, and provides consistent output

Idea of the Data Capacitor
Centralized short-term storage for IU resources
Store your data to compute against, and use it for scratch space during your run
The possibility exists for mid-term storage

Data Capacitor Centralized Storage
Compute using IU's supercomputer Big Red
Compute using IU's Quarry cluster
Archive to IU's massive HPSS tape archive (hierarchical storage; archive your data to tape)

Central to IU Cyberinfrastructure

Physics Research
Dr. Chuck Horowitz, IU physicist
Interested in the behavior of neutron stars
Studying the behavior of nuclear matter near saturation density, which can form an interesting phase, "nuclear pasta"
Using MDGRAPE-2 hardware for increased performance

Physics Research
Particle interaction is simulated via molecular dynamics using specialized MDGRAPE-2 hardware; configurations are saved
Post-processing creates VTK frames
The visualization system ingests the frame data and displays it as a movie
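
To make the post-processing step concrete, here is a minimal sketch (not the group's actual code) that turns saved configurations into legacy-ASCII VTK frames; the file names, the "x y z per line" input format, and the frame numbering are assumptions.

```python
import glob

def write_vtk_frame(coords, out_path, title="MD frame"):
    """Write particle positions as a legacy-ASCII VTK polydata file."""
    with open(out_path, "w") as f:
        f.write("# vtk DataFile Version 3.0\n%s\nASCII\nDATASET POLYDATA\n" % title)
        f.write("POINTS %d float\n" % len(coords))
        for x, y, z in coords:
            f.write("%g %g %g\n" % (x, y, z))

# Convert each saved configuration (assumed: one "x y z" triple per line)
# into a numbered frame that a visualization tool can play back as a movie.
for i, cfg in enumerate(sorted(glob.glob("config_*.dat"))):  # hypothetical file names
    with open(cfg) as f:
        coords = [tuple(map(float, line.split()[:3])) for line in f if line.strip()]
    write_vtk_frame(coords, "frame_%04d.vtk" % i)
```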

Physics Research
[Workflow diagram: compute resource, Data Capacitor, visualization resource, and tape archive]

Earth Science Research
Linked Environments for Atmospheric Discovery (LEAD)
WxChallenge: a meteorological forecast competition. Participants compete to forecast maximum and minimum temperatures, precipitation, and maximum wind speeds for select U.S. cities over a ten-week period each semester

LEAD Workflow
[Workflow diagram: weather data flowing through a compute resource and the Data Capacitor to a computer science cluster and transfer resources]

Extend the Centralized FS Model
The natural progression is to be central to more resources and make data available to more resources
IU did this by extending the filesystem across the wide-area network (WAN)
Data Capacitor WAN (DC-WAN): a new filesystem, separate from the original DC

Data Capacitor WAN

Data Capacitor WAN Tradeoffs
The benefit of a centralized WAN filesystem is the illusion of locality: your data is transferred behind the scenes across the network
At worst, your data will be transferred more slowly than you would like
At best, it is as fast as, or faster than, local storage; performance is typically comparable across research networks

DC-WAN Namespace Mapping
A WAN filesystem challenge is heterogeneous user identification across sites: the numeric user identifier (UID) for a particular user is not the same at every site
You don't have to worry about this, because DC-WAN does the conversion

Site      Username    UID
Indiana   jupmille    648424
TACC      tg803934    803934
PSC       jupmille    43415
NCSA      jupmille    40436
SDSC      jupmille    502639
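
As a purely illustrative sketch of the idea (DC-WAN performs the real translation server-side, inside Lustre, so users never see it), the table above can be thought of as a lookup from each site's identity to one canonical UID; the function and data layout below are hypothetical.

```python
# Hypothetical illustration only: map each site's (username, UID) for one
# researcher to a single canonical identity so file ownership is consistent.
SITE_IDENTITIES = {
    "Indiana": ("jupmille", 648424),
    "TACC":    ("tg803934", 803934),
    "PSC":     ("jupmille", 43415),
    "NCSA":    ("jupmille", 40436),
    "SDSC":    ("jupmille", 502639),
}
CANONICAL_UID = SITE_IDENTITIES["Indiana"][1]  # assume the IU UID is canonical

def map_uid(site, remote_uid):
    """Translate a remote site's UID for this user to the canonical UID."""
    user, uid = SITE_IDENTITIES[site]
    if uid != remote_uid:
        raise KeyError("uid %d does not belong to %s at %s" % (remote_uid, user, site))
    return CANONICAL_UID

print(map_uid("TACC", 803934))  # -> 648424
```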

Physics Research with DC-WAN
[Workflow diagram: simulation on a remote compute resource (Austin, TX), analysis on another compute resource, a visualization resource, and a tape archive, connected through the Data Capacitor WAN]

Astronomy with DC-WAN
[Workflow diagram: data from the WIYN telescope's One Degree Imager (ODI) near Tucson, AZ flows through the Data Capacitor WAN to analysis on a compute resource and a tape archive]
Image: NOAO/AURA/NSF

Center for the Remote Sensing of Ice Sheets (CReSIS) Workflow
[Workflow diagram: field data from Greenland flowing through a compute resource (Lawrence, KS) and the Data Capacitor]

Gas Giant Planet Research
[Workflow diagram: the Data Capacitor WAN linking compute resources at PSC (Pittsburgh, PA), NCSA (Urbana, IL), and MSU (Starkville, MS) with a visualization resource and a tape archive]

Demo
A small sample of the Gas Giant Planet Research
Data is on DC-WAN, which is mounted on two different resources
Compute on PSC's Pople (SGI Altix 4700)
Post-process and visualize results on an IU machine that has proprietary software (IDL v7.0); view over the network

IU's Data Capacitor WAN Filesystem
Funded by Indiana University in 2008
339TB of storage available as a production service
Centralized short-term storage for nationwide resources, including the TeraGrid
Use your data on the best resource for your needs
Short-term storage like the DC; the possibility exists for mid-term storage

Based on the Lustre Filesystem
Lustre is a parallel distributed file system, available under the GNU GPL
Used by the U.S. government, movie studios, financial institutions, and the oil and gas industry
7 of the top 10 HPC systems on the June 2009 "Top 500" list used Lustre; 52 of the top 100 run Lustre in 2010

Based on the Lustre Filesystem
Lustre filesystems can support up to tens of thousands of client systems, petabytes (PB) of storage, and hundreds of gigabytes per second (GB/s) of I/O throughput
The filesystem scales by aggregating separate servers for performance; the storage backend is hidden from the client

Lustre Filesystem Architecture
Lustre presents all clients with a standard POSIX filesystem interface via a filesystem mount
My scratch directory, for example:
IU:   /N/dcwan/scratch/jupmille/
PSC:  /N/dcwan/scratch/jupmille/
TACC: /N/dcwan/scratch/jupmille/
NCSA: /N/dcwan/scratch/jupmille/
Standard commands (ls, cp, cat, etc.) work from the command line
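
Because the mount is plain POSIX, ordinary file APIs work with no special client library; a minimal Python sketch using the scratch path from the slide (the subdirectory and file names are made up):

```python
import os, shutil

scratch = "/N/dcwan/scratch/jupmille"          # same path at IU, PSC, TACC, NCSA
run_dir = os.path.join(scratch, "run01")       # hypothetical job directory

os.makedirs(run_dir, exist_ok=True)            # ordinary POSIX calls throughout
shutil.copy("input.dat", run_dir)              # stage a (made-up) input file

with open(os.path.join(run_dir, "results.txt"), "w") as f:
    f.write("written through the standard filesystem interface\n")

print(os.listdir(run_dir))
```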

Lustre Filesystem Architecture
Metadata Server (MDS): stores the filesystem metadata such as filenames, directories, and permissions; handles file operations such as open/close
Object Storage Server (OSS): bulk I/O server
Object Storage Target (OST): back-end storage device
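
A toy model (not Lustre's API) of that division of labor: the MDS holds a file's name, ownership, and layout, while the bytes live as objects on OSTs behind the OSSs, and the client reassembles the stripes itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class OST:                                  # back-end storage target behind an OSS
    objects: Dict[int, bytes] = field(default_factory=dict)

@dataclass
class FileLayout:                           # what the MDS tracks for one file
    name: str
    owner_uid: int
    stripes: List[Tuple[int, int]]          # ordered (ost_index, object_id) pairs

def read_file(layout: FileLayout, osts: List[OST]) -> bytes:
    # A client asks the MDS for the layout once, then pulls the data objects
    # directly from the OSTs (via their OSSs) and reassembles them in order.
    return b"".join(osts[i].objects[obj] for i, obj in layout.stripes)

# Toy example: one file striped across two OSTs.
osts = [OST({1: b"hello "}), OST({7: b"world\n"})]
layout = FileLayout("demo.txt", owner_uid=648424, stripes=[(0, 1), (1, 7)])
print(read_file(layout, osts))              # b'hello world\n'
```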

Lustre Filesystem Architecture
[Architecture diagram: clients connect to one MDS and multiple OSSs, with each OSS serving several OSTs]

Data Capacitor Hardware
8 pairs of Dell PowerEdge 2950 servers: 2 x 3.0 GHz dual-core Xeon, Myrinet 10G Ethernet, dual-port QLogic 2432 HBA (4 x FC), 2.6 kernel (RHEL 5), Lustre 1.8
4 DDN S2A9550 controllers, each with over 2.4 GB/sec measured throughput
339TB of spinning SATA disk

Data Capacitor WAN Hardware
2 pairs of Dell PowerEdge 2950 servers: 2 x 3.0 GHz dual-core Xeon, Myrinet 10G Ethernet, dual-port QLogic 2432 HBA (4 x FC), 2.6 kernel (RHEL 5), Lustre 1.8
1 DDN S2A9550 controller with over 2.4 GB/sec measured throughput
339TB of spinning SATA disk

Getting the Most out of Lustre
Lustre is optimized for large files (where "large" is >1MB); it is not as good for small files
Lustre has aggressive client-side caching: if you plan to read the same files more than once, this is a big win
Lustre allows you to control how your data is striped across the OSTs, so optimization based on your I/O patterns can reap benefits in throughput
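
Striping is controlled from the client with the lfs utility; a hedged sketch follows (the directory, the stripe count, and the exact option spelling are examples, and flags vary between Lustre releases, so check `lfs help setstripe` on your system).

```python
import os, subprocess

out_dir = "/N/dcwan/scratch/jupmille/big_output"   # example directory
os.makedirs(out_dir, exist_ok=True)

# Spread new files in this directory across 8 OSTs so large sequential I/O
# hits more servers in parallel (-c sets the stripe count).
subprocess.check_call(["lfs", "setstripe", "-c", "8", out_dir])

# Inspect the striping that will be applied to files created here.
print(subprocess.check_output(["lfs", "getstripe", out_dir]).decode())
```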

Lustre WAN Future
DC-WAN will be mounted on the India and Sierra FutureGrid clusters; this is in the testing phase right now
IU's Lustre UID mapping code will be used in a new TeraGrid Lustre-WAN project that is in development now

Thank you for listening. Questions are welcome. Please use the moderators for Q&A.
Justin Miller, jupmille@indiana.edu
Data Capacitor Team, dc-team-l@indiana.edu
http://pti.iu.edu/dc