Indiana University's Lustre WAN: The TeraGrid and Beyond

Indiana University's Lustre WAN: The TeraGrid and Beyond
Stephen C. Simms
Manager, Data Capacitor Project
TeraGrid Site Lead, Indiana University
ssimms@indiana.edu
Lustre User Group Meeting, April 17, 2009, Sausalito, California

The Data Capacitor Project
NSF funded in 2005
535 terabytes of Lustre storage
14.5 GB/s aggregate write
Short-term storage
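Since the Data Capacitor is explicitly short-term storage, it is worth noting how quickly the whole capacity could turn over at the quoted aggregate write rate. The back-of-the-envelope check below is only an illustrative sketch: it assumes decimal units for the 535 TB and 14.5 GB/s figures and a sustained rate that real workloads would not hold.

```python
# Rough turnover time for the Data Capacitor at its aggregate write rate.
# Assumes decimal units (535 TB, 14.5 GB/s) and a sustained write rate,
# which real workloads would not maintain; order-of-magnitude only.
capacity_bytes = 535e12        # 535 TB of Lustre storage
write_rate_Bps = 14.5e9        # 14.5 GB/s aggregate write

hours_to_fill = capacity_bytes / write_rate_Bps / 3600
print(f"~{hours_to_fill:.1f} hours to fill at the full aggregate write rate")
# ~10.2 hours, consistent with treating the system as short-term scratch space.
```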

What's a Data Capacitor?
12 pairs of Dell PowerEdge 2950 servers
  2 x 3.0 GHz dual-core Xeon
  Myrinet 10G Ethernet
  Dual-port QLogic 2432 HBA (4 x FC)
  2.6 kernel (RHEL 4)
6 DDN S2A9550 controllers
  Over 2.4 GB/sec measured throughput each
535 terabytes of spinning SATA disk

Maximize Utility Through One-Stop Shopping
The Data Capacitor sits at the center of IU's cyberinfrastructure
Opens the door to new workflow opportunities
Provides the user with:
  Fast paths to archive storage
  A shared filesystem for computation, visualization, instruments, and potentially desktops

LEAD Workflow Example

Lustre Across the WAN
Diagram: Lustre client at IU; metadata server and object storage servers at PSC; 17 ms latency across the network between sites
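To put the 17 ms figure in context, the bandwidth-delay product determines how much data must be in flight to keep a long fat pipe busy. The sketch below is illustrative only: the 10 Gb/s line rate is taken from the single-client tests on the next slide, and since the slide does not say whether 17 ms is one-way or round-trip, both cases are computed.

```python
# Bandwidth-delay product (BDP) for a 10 Gb/s path with ~17 ms latency,
# as described for the IU-PSC Lustre WAN mount. Whether 17 ms is one-way
# or round-trip is not stated, so both interpretations are shown.
line_rate_Bps = 10e9 / 8       # 10 Gb/s expressed as bytes per second

for label, rtt_s in [("17 ms RTT", 0.017), ("17 ms one-way (34 ms RTT)", 0.034)]:
    bdp_MiB = line_rate_Bps * rtt_s / 2**20
    print(f"{label}: ~{bdp_MiB:.0f} MiB must be in flight to fill the link")
```

Roughly 20 to 40 MiB outstanding per client, which is why WAN mounts depend on client-side concurrency and large I/O rather than many small requests.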

10 GigE Single Client Tests
977 MB/s between ORNL and IU using a 10 Gb TeraGrid connection
Identical Dell 2950 clients: 2 x dual 3.0 GHz Xeon, 4 GB RAM
1 Myricom Myri-10G card in Ethernet mode
Lustre 1.4.7.1
16 of 24 Data Capacitor (DC) OSSs
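As a sanity check on the 977 MB/s figure, the short calculation below compares it to the raw payload rate of a 10 Gb link; it assumes "MB/s" means decimal megabytes and ignores Ethernet, IP, and Lustre protocol overhead.

```python
# How close the single-client result comes to the raw 10 Gb line rate.
# Assumes decimal megabytes and ignores protocol overhead.
measured_MBps = 977
line_rate_MBps = 10e9 / 8 / 1e6     # 10 Gb/s = 1250 MB/s
print(f"{measured_MBps / line_rate_MBps:.0%} of the raw 10 Gb line rate")   # ~78%
```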

2007 Bandwidth Challenge: Five Applications Simultaneously
  Acquisition and visualization of live instrument data (chemistry)
  Rare archival material (humanities)
  Acquisition, analysis, and visualization of trace data (computer science)
  Simulation data (life science)
  High energy physics

Challenge Results

Beyond a Demo

Florida LambdaRail
In 2008: UFL (University of Florida), FIT (Florida Institute of Technology), FIU (Florida International University)

IU UID Mapping
  Lightweight: not everyone needs or wants Kerberos; not everyone needs or wants encryption
  Only changes MDS code
  Want to maximize the number of clients we can serve
  Simple enough to port forward

UID Mapping
  Lookup calls a pluggable kernel module
  Binary tree stored in memory, keyed on NID or NID range
  Remote UID mapped to an effective UID
  Other lookup schemes are possible
(A small illustrative sketch of this lookup follows below.)
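For illustration only, here is a user-space Python sketch of the lookup the slide describes: an in-memory structure keyed by NID range that rewrites a remote UID to an effective local UID, with unknown clients squashed to a default. The class, field names, and integer NID encoding are hypothetical; the real mechanism is a pluggable Lustre kernel module using a binary tree, not this Python code.

```python
# Illustrative sketch of NID-range based UID mapping (not IU's kernel module).
# NIDs are represented as plain integers here for simplicity.
import bisect
from dataclasses import dataclass, field

@dataclass
class NidRangeUidMap:
    """Maps (NID range, remote UID) -> effective local UID."""
    ranges: list = field(default_factory=list)   # sorted (nid_start, nid_end, {remote_uid: effective_uid})

    def add_range(self, nid_start, nid_end, uid_map):
        self.ranges.append((nid_start, nid_end, dict(uid_map)))
        self.ranges.sort(key=lambda r: r[0])

    def lookup(self, nid, remote_uid, default_uid=65534):
        # Find the last range starting at or before this NID (the kernel
        # module keeps an in-memory binary tree for the same purpose).
        starts = [r[0] for r in self.ranges]
        i = bisect.bisect_right(starts, nid) - 1
        if i >= 0:
            nid_start, nid_end, uid_map = self.ranges[i]
            if nid <= nid_end:
                return uid_map.get(remote_uid, default_uid)
        return default_uid            # squash unknown clients, e.g. to "nobody"

# Example: clients in one remote site's NID range get their UIDs remapped.
m = NidRangeUidMap()
m.add_range(0x0A000001, 0x0A0000FF, {1001: 50123, 1002: 50124})
print(m.lookup(0x0A000010, 1001))    # -> 50123 (mapped effective UID)
print(m.lookup(0x0B000001, 1001))    # -> 65534 (NID outside any known range)
```

Other lookup schemes mentioned on the slide (for example, per-NID rather than range-based mapping) would slot into the same interface.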

Announced One Year Ago: IU Lustre WAN Service
  Dedicated system of 360 TB for research
  Served on a per-project basis
  Single DDN S2A9950 controller
  6 Dell 2950 servers: 4 object storage servers and 1 pair of metadata servers for failover

EVIA (University of Michigan)
Workflow diagram: uncompressed digital video from the University of Michigan, IU HPSS, Samba, IU Windows tools

CReSIS (University of Kansas)
Workflow diagram: Greenland ice sheet data from the University of Kansas, IU HPSS, IU's Quarry supercomputer

The TeraGrid
TeraGrid is an open scientific discovery infrastructure combining leadership-class resources at eleven partner sites to create an integrated, persistent computational resource.
Partner sites: ANL, IU, Louisiana Optical Network Initiative (LONI), NCSA, NICS, ORNL, PSC, Purdue, SDSC, TACC, NCAR

The TeraGrid Map
Map of resource provider (RP) sites, software integration partners, and network hubs: NCAR, UW, UC/ANL, Grid Infrastructure Group (UChicago), PU, PSC, USC/ISI, Caltech, SDSC, NCSA, IU, ORNL, Tennessee, UNC/RENCI, TACC, LONI/LSU

Lustre on the TeraGrid: IU, LONI, NCSA, NICS, PSC, TACC

IU's Lustre WAN on the TeraGrid
  Production mount at PSC: Altix 4700 and login nodes, GridFTP and gatekeeper nodes
  Test mounts at LSU, TACC, and SDSC: login nodes
  LSU and SDSC ready to deploy on some compute nodes
  Future plans for NCSA, ORNL, and Purdue

3D Hydrodynamic Workflow
Workflow diagram: PSC Altix 4700, HPSS, IU Astronomy Department

J-WAN (Josephine Palencia at PSC)
  Testing advanced Lustre features: distributed OSTs, Kerberos, eventually CMD
  Successful mount and testing with SDSC

Beyond

WIYN Telescope at Kitt Peak (NOAO/AURA/NSF)
Workflow diagram: metadata acquisition service, HPSS, IRAF service

Many Thanks
  Josh Walgenbach, Justin Miller, Nathan Heald, James McGookey, Scott Michael, Matt Link (IU)
  Kit Westneat (DDN)
  Sun/CFS support and engineering
  Michael Kluge, Guido Juckeland, Robert Henschel, Matthias Mueller (ZIH, Dresden)
  Thorbjorn Axellson (CReSIS)
  Craig Prescott (UFL)
  Ariel Martinez (LONI)
  Greg Pike and ORNL
  Doug Balog, Josephine Palencia, and PSC
Support for this work, provided by the National Science Foundation (CNS-0521433), is gratefully acknowledged and appreciated.

Thank you! Questions? ssimms@indiana.edu dc-team-l@indiana.edu http://datacapacitor.iu.edu