RESEARCH DATA DEPOT AT PURDUE UNIVERSITY


Preston Smith, Director of Research Services. RESEARCH DATA DEPOT AT PURDUE UNIVERSITY. HTCondor Week 2016, May 18, 2016.

Ran into Miron at a workshop recently... We talked about data and the challenges of providing that service to campus. Miron: "I'd like to talk about that at HTCondor Week!" Campuses and HTCondor sites all face challenges with storage, so... (Dramatization)

COMMUNITY CLUSTERS: A BUSINESS MODEL FOR COMPUTING AT PURDUE

COMMUNITY CLUSTERS: THE BASIC RULES. You get out at least what you put in: buy 1 node or 100, and you get a queue that guarantees access to up to that many CPUs. But wait, there's more! What if your neighbor isn't using his queue? You can use it, but your job is subject to a time limit. And you don't have to do the work: your grad student gets to do research rather than run your cluster, and you don't have to provide space in your lab for computers. IT provides data center space, systems administration, and application support. Just submit jobs!

COMMUNITY CLUSTERS, VERSION 2: FURTHER REFINEMENT. A five-year cycle: we build a cluster every year, and vendors provide a five-year warranty. After five years, the MOU with faculty says the cluster will be retired, and faculty get credit for the remaining market value of their nodes toward the next cluster. Community clusters now appear to funding agencies as paying for a service, not a capital purchase.

CHALLENGES TO SMALLER COMMUNITIES. HPC and HTC communities prefer different points to optimize the scheduler. Small but key communities (like large memory) lose the benefits of standby queues when fewer nodes are spread between several clusters. HTC or large-memory communities often have little need for HPC-specific optimizations such as accelerators or high-speed, low-latency networks. Emerging communities often don't fit the existing model at all: big data analytics, graphics rendering, and nontraditional platforms (Windows, cloud).

OBLIGATORY HTC/HTCONDOR CONTENT COMING

HTCONDOR: We still run a fairly substantial HTCondor resource, serving CMS, opportunistic access for many OSG VOs, and several Purdue researchers. Out of 240M hours delivered by our center last year, ~31M were high throughput. SEE, I'LL TALK ABOUT IT!

DATA: STORAGE INFRASTRUCTURE FOR RESEARCH DATA

SCRATCH HPC DATA: Scratch needs are climbing. [Chart: average GB used by the top 50 users, on a scale of 0 to 8,000 GB.]

FORTRESS HPC DATA: Archive usage is skyrocketing. [Chart: Fortress Archive Growth, labeled values 110,909; 550,146; 1,002,297; 2,109,908; 3,273,254; 4,035,107 (TB used).]

HPC STORAGE: TIERS OF STORAGE. Scratch: fast, large, purged, coupled with clusters, per-user, for running jobs. Working space: medium speed, large, persistent, data protected, purchased, per research lab, for shared data and apps. Archival space: high-speed, infinite capacity, highly protected, available to all researchers for permanent storage.

RESEARCH DATA ACROSS CAMPUS: CURRENT STATE. Working with researchers across campus, we encounter many different data solutions, from something at the department/workgroup level:

RESEARCH DATA ACROSS CAMPUS: CURRENT STATE. ...to this...

RESEARCH DATA ACROSS CAMPUS: CURRENT STATE. ...and this.

RESEARCH STORAGE GAPS: THE PROBLEM. Many central storage options have not met all the needs that researchers care about. Departmental or lab resources are islands, not accessible from HPC clusters. Most are individually oriented rather than built around the notion of a research lab. Scratch filesystems are also limited in scope to a single HPC system. Some options are not performant enough for research use. And sometimes nothing is available for sale, despite faculty having funds.

RESEARCH STORAGE GAPS: COMMON QUESTIONS. In the past, we've heard lots of common requests: "I need more space than I can get in scratch." "Where can I install applications for my entire research lab?" "I'm actively working on that data/software in scratch: I have to go to great lengths to keep it from being purged. I shouldn't have to pull it from Fortress over and over." "Can I get a UNIX group created for my students and me?" "Is there storage that I can get to on all the clusters I use?" "I have funding to spend on storage; what do you have to sell?" "I need storage for my instrument to write data into." "My student has the only copy of my research data in his home directory, and he graduated/went off the grid!"

RESEARCH STORAGE GAPS: SOME SOLUTIONS. We've addressed some of these by improving scratch: everybody automatically gets access to the HPSS archive for permanent data storage, per-user scratch quotas are very large, and the purge policy is friendlier, based on when data was last used rather than when it was created (see the sketch below).
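To make the "purge by use, not by age" idea concrete, here is a minimal sketch, not our production purge tooling: it walks a scratch tree and flags files that have neither been read nor modified within a purge window. The path and the 60-day window are hypothetical examples.

```python
import time
from pathlib import Path

# Hypothetical purge window; real policies vary by site.
PURGE_DAYS = 60
CUTOFF = time.time() - PURGE_DAYS * 86400


def purge_candidates(root: str):
    """Yield files that have neither been read nor modified in the window."""
    for path in Path(root).rglob("*"):
        if path.is_file():
            st = path.stat()
            # Consider both access and modification times, so recently *used*
            # data is kept even if it was created long ago.
            if max(st.st_atime, st.st_mtime) < CUTOFF:
                yield path


if __name__ == "__main__":
    for f in purge_candidates("/scratch/example_user"):  # hypothetical path
        print(f)
```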

DEPOT: WHY A "DEPOT"? As in a transport hub: a place where large amounts of cargo are stored, loaded, unloaded, and moved from place to place.

DATA: GUIDING PRINCIPLES. It's important to think of the Depot as a data service, not just storage. It is not enough to provide infrastructure alone ("here's a mountpoint, have fun"). Our goal is enabling the frictionless use and movement of data: Instrument -> Depot -> Scratch -> Fortress -> Collaborators -> and back. And we continue to improve access for non-UNIX users.
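As one concrete hop in that flow (Scratch to Fortress), here is a minimal sketch assuming the standard HPSS command-line client htar is available, as is typical for HPSS-backed archives like Fortress; the paths are hypothetical examples, not actual Depot or Fortress locations.

```python
import subprocess

# Hypothetical locations for illustration only.
scratch_dir = "/scratch/example_user/simulation_output"
archive_path = "/home/example_user/simulation_output.tar"

# Bundle the directory into a tar file written directly into the archive.
subprocess.run(["htar", "-cvf", archive_path, scratch_dir], check=True)

# List the archived contents to confirm the data landed in Fortress.
subprocess.run(["htar", "-tvf", archive_path], check=True)
```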

THE SERVICE: FEATURES. Purdue researchers can purchase storage! An affordable storage service for research that addresses many common requests: 100 GB available at no charge to research groups; mounted on all clusters and exported via CIFS to labs; not scratch: backed up via snapshots, with DR coverage; data in the Depot is owned by the faculty member; sharing via Globus, CIFS, and the web; and a place to maintain group-wide copies of application software or shared data.
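The "backed up via snapshots" feature means a user can often recover a deleted file themselves. A minimal sketch, assuming snapshots are exposed read-only under a ".snapshots" directory, as is common on GPFS-backed storage; the directory, snapshot name, and file path are hypothetical.

```python
import shutil
from pathlib import Path

# Hypothetical lab space, snapshot name, and lost file for illustration.
depot_dir = Path("/depot/mylab")
snapshot = "daily-2016-05-17"
lost_file = "data/results.csv"

src = depot_dir / ".snapshots" / snapshot / lost_file
dst = depot_dir / lost_file

# Copy the file back out of the read-only snapshot, preserving metadata.
shutil.copy2(src, dst)
print(f"Restored {dst} from snapshot {snapshot}")
```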

A TOOL-FRIENDLY STRUCTURE AS A STARTING POINT with POSIX ACLs to overcome group permission and umask challenges!
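One way such a lab-shared structure can sidestep per-user umask defaults is with POSIX default ACLs. Below is a minimal sketch using the standard setfacl/getfacl utilities via Python; the directory and group name are hypothetical examples, not the actual Depot layout.

```python
import subprocess

# Hypothetical Depot-style path and UNIX group for illustration only.
project_dir = "/depot/mylab"
group = "mylab-group"

# Grant the lab group read/write (and traverse on directories) on existing files.
subprocess.run(["setfacl", "-R", "-m", f"g:{group}:rwX", project_dir], check=True)

# Set a *default* ACL so newly created files inherit group access,
# regardless of each user's umask.
subprocess.run(["setfacl", "-R", "-d", "-m", f"g:{group}:rwX", project_dir], check=True)

# Inspect the resulting ACLs.
subprocess.run(["getfacl", project_dir], check=True)
```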

SELF-SERVICE: MANAGE YOUR OWN ACCESS

A SOLUTION: ADOPTION. Well received! In just over a year, over 260 research groups are participating, and many are not HPC users. 0.75 PB has gone into use since November 2014. A research group purchasing space has bought, on average, 8.6 TB.

A TESTIMONIAL: SATISFIED FACULTY. "It's really, really useful. It's actually one of the best things about this setup. In the past, with other HPC systems I've used, moving all the data around that these models output has always been a major pain. One of these high-resolution simulations can produce terabytes of output, and you've got to have somewhere you can put that in order to analyze it. The Research Data Depot is a place where I can put a large amount of data, I can mount it on local machines and access it. The Research Data Depot [is] large like the scratch, but it's disk-based and fast, unlike the tape archive, so it's got the best of both worlds, I think, and it's backed up! I think it's great. I use it a lot. I have 25 terabytes right now and I'd like to get more." Daniel Dawson, Assistant Professor of Earth, Atmospheric, and Planetary Science

THE TECHNOLOGY: WHAT DID WE GET? Approximately 2.25 PB of IBM GPFS. Hardware provided by a pair of DataDirect Networks SFA12K arrays, one in each of two datacenters. 160 Gb/sec to each datacenter. 5x Dell R620 servers in each datacenter.

DESIGN TARGETS: WHAT DO WE NEED TO DO? Depot requirements (all of which the Research Data Depot can do) versus previous solutions:
Requirement | Previous solutions
At least 1 PB usable capacity | >1 PB
40 GB/sec throughput | 5 GB/sec
<3 ms average latency, <20 ms maximum latency | Variable
100k IOPS sustained | 55k
300 MB/sec minimum client speed | 200 MB/sec maximum
Support 3,000 simultaneous clients | Yes
Filesystem snapshots | Yes
Multi-site replication | No
Expandable to 10 PB | Yes
Fully POSIX compliant, including parallel I/O | No

GROWTH BY COLLEGE: GROWTH AREAS.
College | 2015 | New Groups | 2014 | Growth rate 2014-2015
Liberal Arts | 7 | 6 | 1 | 600%
Education | 2 | 1 | 1 | 100%
HHS | 22 | 8 | 14 | 57%
Agriculture | 74 | 26 | 48 | 54%
Science (bio) | 21 | 6 | 15 | 40%
Pharmacy | 6 | 1 | 5 | 20%
Engineering | 188 | 27 | 161 | 17%
Science (non-bio) | 200 | 16 | 184 | 9%
Technology | 14 | 1 | 13 | 8%
Management | 20 | 0 | 20 | 0%
Vet School | 1 | 1 | 0 | -
Total | 555 | 93 | 462 | 20%
(Callout on the slide: Life Science!)

RELATED CHALLENGES: BEYOND THE DATACENTER

IMPROVED NETWORKING: WHAT DOES THIS MEAN FOR RESEARCHERS? As science gets more data-intensive, researchers require increasing amounts of bandwidth (GigE on up). The last mile to their labs will be the key!

IMPROVED NETWORKING: INSTRUMENTS. Instruments are getting cheaper, more common, and generate more data. Researchers need high-speed (10 Gb+) connections for labs and instruments to move data into clusters, the Research Data Depot, and research WAN connections; the estimate below shows why.
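For a rough sense of scale, here is a back-of-the-envelope estimate of how long a 1 TB instrument dataset takes to move at different link speeds, assuming roughly 80% of line rate is achievable in practice (an assumption for illustration, not a measured figure).

```python
# Rough transfer-time estimate for a 1 TB dataset at various link speeds.
DATASET_BYTES = 1e12  # 1 TB

for name, gbps in [("1 GigE", 1), ("10 GigE", 10), ("40 GigE", 40)]:
    effective_bytes_per_sec = gbps * 1e9 / 8 * 0.8  # 80% of line rate
    hours = DATASET_BYTES / effective_bytes_per_sec / 3600
    print(f"{name:>8}: ~{hours:.1f} hours per TB")
```

At GigE speeds each terabyte ties up the link for close to three hours; at 10 GigE it drops to well under half an hour, which is why the last-mile and instrument connections matter.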

FIREWALLS: SAFETY OR PERFORMANCE? The situation on campus is mixed, but researchers shouldn't have to choose. Watch for fast GigE links being throttled by 100 Mb firewalls! We need to move researchers from DIY firewalls to the enterprise-grade one.

GLOBUS IS KEY: STATISTICS. Data transferred in the last year: nearly 300 TB, averaging 23 TB and 50 unique users each month. Killer feature: sharing!
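For illustration, here is a minimal sketch of driving such a transfer with the globus-sdk Python package. The client ID, endpoint UUIDs, and paths are hypothetical placeholders; real values come from the Globus web app, and this is not the Depot's own tooling.

```python
import globus_sdk

# Hypothetical identifiers for illustration only.
CLIENT_ID = "YOUR-NATIVE-APP-CLIENT-ID"
DEPOT_ENDPOINT = "aaaaaaaa-1111-2222-3333-bbbbbbbbbbbb"   # e.g. a Depot collection
LAB_ENDPOINT = "cccccccc-4444-5555-6666-dddddddddddd"     # e.g. a lab workstation

# Interactive (native app) login to obtain a transfer token.
auth_client = globus_sdk.NativeAppAuthClient(CLIENT_ID)
auth_client.oauth2_start_flow()
print("Log in at:", auth_client.oauth2_get_authorize_url())
auth_code = input("Paste the authorization code: ").strip()
tokens = auth_client.oauth2_exchange_code_for_tokens(auth_code)
transfer_token = tokens.by_resource_server["transfer.api.globus.org"]["access_token"]

# Submit an asynchronous transfer from the Depot to the lab machine.
tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(transfer_token)
)
tdata = globus_sdk.TransferData(tc, DEPOT_ENDPOINT, LAB_ENDPOINT, label="depot-to-lab")
tdata.add_item("/depot/mylab/results/", "/data/results/", recursive=True)
task = tc.submit_transfer(tdata)
print("Submitted Globus task:", task["task_id"])
```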

FUTURE WORK: Hire staff with expertise in the mechanics of working with data. Work with the Libraries to integrate with the institutional repository. Figure out how to centrally fund the storage.

THANK YOU! QUESTIONS? Contact us: psmith@purdue.edu, @PurdueRCAC, http://www.facebook.com/purduercac