RESEARCH DATA DEPOT AT PURDUE UNIVERSITY
Preston Smith, Director of Research Services
HTCondor Week 2016, May 18, 2016
Ran into Miron at a workshop recently... talked about data and the challenges of providing that service to campus. Miron: "I'd like to talk about that at HTCondor Week!" Campuses and HTCondor sites all face challenges with storage, so... (Dramatization)
COMMUNITY CLUSTERS: A BUSINESS MODEL FOR COMPUTING AT PURDUE
COMMUNITY CLUSTERS: THE BASIC RULES
- You get out at least what you put in. Buy 1 node or 100, and you get a queue that guarantees access to up to that many CPUs.
- But wait, there's more! What if your neighbor isn't using his queue? You can use it, but your job is subject to a time limit.
- You don't have to do the work: your grad student gets to do research rather than run your cluster. Nor do you have to provide space in your lab for computers.
- IT provides data center space, systems administration, and application support. Just submit jobs!
COMMUNITY CLUSTERS, VERSION 2: FURTHER REFINEMENT
- 5-year cycle: we build a cluster every year! Vendors provide a 5-year warranty.
- After 5 years, an MOU with faculty says that the cluster will be retired.
- Faculty get credit for the remaining market value of their nodes toward the next cluster.
- Community clusters now appear to funding agencies as paying for a service, not a capital purchase.
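The trade-in credit can be sketched as simple depreciation over the warranty lifetime. The straight-line schedule below is an illustrative assumption, not the actual formula from the MOU:

```python
def tradein_credit(purchase_price: float, age_years: float,
                   lifetime_years: float = 5.0) -> float:
    """Estimate the remaining market value of a node for trade-in credit,
    assuming straight-line depreciation over the 5-year warranty lifetime.
    (Illustrative assumption only, not the MOU's actual schedule.)"""
    if age_years >= lifetime_years:
        return 0.0
    return purchase_price * (1.0 - age_years / lifetime_years)

# e.g. a $6,000 node traded in after 3 of its 5 years retains ~$2,400
print(tradein_credit(6000, 3))
```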
CHALLENGES TO SMALLER COMMUNITIES
- HPC and HTC communities prefer different points to optimize in the scheduler.
- Small but key communities (like large memory) lose the benefits of standby queues when fewer nodes are spread between several clusters.
- HTC or large-memory communities often have little need for HPC-specific optimizations: accelerators, or high-speed, low-latency networks.
- Emerging communities often don't fit the existing model at all: big data analytics, graphics rendering, nontraditional platforms (Windows, cloud).
OBLIGATORY HTC/HTCONDOR CONTENT COMING
HTCONDOR
- We still run a fairly substantial HTCondor resource, serving CMS, opportunistic access for many OSG VOs, and several Purdue researchers.
- Of the 240M hours delivered by our center last year, ~31M were high throughput.
- SEE, I'LL TALK ABOUT IT!
DATA: STORAGE INFRASTRUCTURE FOR RESEARCH DATA
SCRATCH (HPC DATA): Scratch needs are climbing. [Chart: Avg GB used (top 50 users)]
FORTRESS (HPC DATA): Archive usage is skyrocketing. [Chart: Fortress Archive Growth, TB used]
HPC STORAGE: TIERS OF STORAGE
- Scratch: fast, large, purged, coupled with clusters; per-user, for running jobs.
- Working space: medium speed, large, persistent, data protected, purchased; per research lab, for shared data and apps.
- Archival space: high-speed, infinite capacity, highly protected; available to all researchers for permanent storage.
RESEARCH DATA ACROSS CAMPUS: CURRENT STATE
Working with researchers across campus, we encounter many different data solutions... from something at the department/workgroup level:
...to this:
...and this:
RESEARCH STORAGE GAPS: THE PROBLEM
- Many central storage options have not met all the needs that researchers care about.
- Departmental or lab resources are islands, not accessible from HPC clusters.
- Most are individually oriented, rather than built around the notion of a research lab.
- Scratch filesystems are also limited in scope to a single HPC system.
- Some are not performant enough for research use.
- Sometimes nothing is available for sale, despite faculty having funds.
RESEARCH STORAGE GAPS: COMMON QUESTIONS
In the past, we've heard lots of common requests:
- "I need more space than I can get in scratch."
- "Where can I install applications for my entire research lab?"
- "I'm actively working on that data/software in scratch: I have to go to great lengths to keep it from being purged. I shouldn't have to pull it from Fortress over and over."
- "Can I get a UNIX group created for my students and me?"
- "Is there storage that I can get to on all the clusters I use?"
- "I have funding to spend on storage: what do you have to sell?"
- "I need storage for my instrument to write data into."
- "My student has the only copy of my research data in his home directory, and he graduated/went off the grid!"
RESEARCH STORAGE GAPS: SOME SOLUTIONS
We've addressed some of these by improving scratch:
- Everybody automatically gets access to the HPSS archive for permanent data storage.
- Very large per-user scratch quotas.
- A friendlier purge policy based on when data was last used, rather than when it was created.
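A use-based purge policy can be approximated by scanning file access times rather than creation times. This is a minimal sketch, not the production purge tool; the 60-day idle threshold is a hypothetical example value:

```python
import time
from pathlib import Path

def purge_candidates(root: str, max_idle_days: float = 60.0) -> list:
    """List files whose last access time (atime) is older than the
    threshold. Purging on use rather than creation time spares data
    that is still being actively read. (Sketch only; the 60-day
    threshold is a hypothetical value, not the actual policy.)"""
    cutoff = time.time() - max_idle_days * 86400
    candidates = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.stat().st_atime < cutoff:
            candidates.append(str(path))
    return candidates
```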
WHY A "DEPOT"? As a transport hub: a place where large amounts of cargo are stored, loaded, unloaded, and moved from place to place.
DATA: GUIDING PRINCIPLES
- It's important to think of the Depot as a data service, not storage. It is not enough to just provide infrastructure ("Here's a mountpoint, have fun").
- Our goal: enabling the frictionless use and movement of data. Instrument -> Depot -> Scratch -> Fortress -> Collaborators -> and back.
- Continue to improve access for non-UNIX users.
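The movement path above can be sketched as staged copies between mount points. The mount-point names and hop order here are illustrative assumptions, not the Depot's actual transfer machinery:

```python
import shutil
from pathlib import Path

# Hypothetical mount points for each stage of the data path.
STAGES = ["instrument", "depot", "scratch", "fortress"]

def stage_file(filename: str, roots: dict) -> None:
    """Copy a file along the pipeline one hop at a time:
    instrument -> depot -> scratch -> fortress.
    `roots` maps each stage name to its mount-point directory.
    (A sketch of the data path, not a real transfer tool.)"""
    for src, dst in zip(STAGES, STAGES[1:]):
        src_path = Path(roots[src]) / filename
        dst_path = Path(roots[dst]) / filename
        dst_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_path, dst_path)  # preserves timestamps
```

In practice each hop would be a Globus transfer or an HPSS put rather than a local copy, but the staging pattern is the same.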
THE SERVICE: FEATURES
Purdue researchers can purchase storage! An affordable storage service for research that addresses many common requests:
- 100 GB available at no charge to research groups.
- Mounted on all clusters and exported via CIFS to labs.
- Not scratch: backed up via snapshots, with DR coverage.
- Data in the Depot is owned by the faculty member!
- Sharing ability via Globus, CIFS, and WWW.
- Maintain group-wide copies of application software or shared data.
A TOOL-FRIENDLY STRUCTURE AS A STARTING POINT, with POSIX ACLs to overcome group permission and umask challenges!
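Default POSIX ACLs are what let every file a group member creates stay group-accessible regardless of each user's umask. This sketch only builds the `setfacl` invocations without running them; the group name and path are hypothetical examples:

```python
def group_acl_commands(path: str, group: str) -> list:
    """Build the setfacl commands that grant a research group rwx on a
    directory, plus a *default* ACL so newly created files inherit the
    group access (overriding each user's umask). The group and path
    arguments are hypothetical examples; commands are returned, not run."""
    return [
        ["setfacl", "-m", f"g:{group}:rwx", path],        # access ACL on the dir
        ["setfacl", "-d", "-m", f"g:{group}:rwx", path],  # default ACL, inherited
    ]

for cmd in group_acl_commands("/depot/mylab", "mylab-users"):
    print(" ".join(cmd))
```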
SELF-SERVICE: MANAGE YOUR OWN ACCESS
ADOPTION
- Well received! In just over a year, over 260 research groups are participating. Many are not HPC users!
- 0.75 PB in use since November 2014.
- A research group purchasing space has bought, on average, 8.6 TB.
A TESTIMONIAL: SATISFIED FACULTY
"It's really, really useful. It's actually one of the best things about this setup. In the past, with other HPC systems I've used, moving all the data around that these models output has always been a major pain. One of these high resolution simulations can produce terabytes of output and you've got to have somewhere you can put that in order to analyze it. The Research Data Depot is a place where I can put a large amount of data, I can mount it on local machines and access it. The Research Data Depot [is] large like the scratch but it's disk based and fast unlike the tape archive, so it's got the best of both worlds, I think, and it's backed up! I think it's great. I use it a lot. I have 25 terabytes right now and I'd like to get more." Daniel Dawson, Assistant Professor of Earth, Atmospheric, and Planetary Science
THE TECHNOLOGY: WHAT DID WE GET?
- Approximately 2.25 PB of IBM GPFS.
- Hardware provided by a pair of DataDirect Networks SFA12K arrays, one in each of two datacenters.
- 160 Gb/sec to each datacenter.
- 5x Dell R620 servers in each datacenter.
DESIGN TARGETS: WHAT DO WE NEED TO DO?
What the Research Data Depot can do, versus previous solutions:

Depot requirement                               | Previous solutions
At least 1 PB usable capacity                   | >1 PB
40 GB/sec throughput                            | 5 GB/sec
<3 ms average latency, <20 ms maximum latency   | Variable
100k IOPS sustained                             | 55k
300 MB/sec minimum client speed                 | 200 MB/sec max
Support 3,000 simultaneous clients              | Yes
Filesystem snapshots                            | Yes
Multi-site replication                          | No
Expandable to 10 PB                             | Yes
Fully POSIX compliant, including parallel I/O   | No
GROWTH BY COLLEGE: GROWTH AREAS
[Table: new groups in 2015 and growth rate since 2014, by college: Liberal Arts, Education, HHS, Agriculture, Science (bio), Pharmacy, Engineering, Science (non-bio), Technology, Management, Vet School]
20% Life Science!
CHALLENGES BEYOND THE DATACENTER
IMPROVED NETWORKING: WHAT DOES THIS MEAN FOR RESEARCHERS?
As science gets more data-intensive, researchers require increasing amounts of bandwidth (GigE on up). The last mile to their labs will be the key!
IMPROVED NETWORKING: INSTRUMENTS
Instruments are getting cheaper and more common, and they generate more data. Researchers need high-speed (10 Gb+) connections for labs and instruments to move data into clusters, the Research Data Depot, and research WAN connections.
FIREWALLS: SAFETY OR PERFORMANCE?
The situation on campus is mixed, but researchers shouldn't have to choose. Watch for fast GigE links being throttled by 100 Mb firewalls! We need to get researchers from DIY firewalls onto the enterprise-grade one.
GLOBUS IS KEY: STATISTICS
Data transferred in the last year:
- Volume: nearly 300 TB transferred!
- An average of 23 TB and 50 unique users each month.
- Killer feature: sharing!
FUTURE WORK
- We need staff with expertise in the mechanics of working with data.
- Work with the Libraries to integrate with the institutional repository.
- How do we centrally fund the storage?