1 ANL Site Update ScicomP/SPXXL Summer Meeting 2016 Ray Loy Ben Allen Gordon McPheeters Jack O'Connell William Scullin Tim Williams
2 Site Update. System overview; Support topics; GPFS AFM Burst Buffer; GHI deployment; Performance Portability. 2
3 ALCF Resources 3
4 ALCF Roadmap (CORAL). Theta, interim system (late 2016): Cray/Intel KNL, >=2500 nodes, >=60 cores/node (8.5 PF); peak comparable to Mira. Aurora, late 2018: ALCF-3, successor to Mira; Cray/Intel KNH, >=50K nodes. Blue Gene will continue to be a primary resource through the installation of Aurora; both HW and SW considerations for extended life. 4
5 Support/Requirements 5
6 GPFS on BG/Q. Current: ESS GPFS; DDNs; BG/Q GPFS 3.5. Cannot upgrade the ESS without BG/Q GPFS >=4.1, and end-of-service for GPFS 3.5 is coming up in 2017, while BG/Q will continue in service well beyond that. Support for GPFS 4.1 (or 4.2) on BG/Q IONs would be very helpful. Concern is whether fixes for GPFS and AFM at GPFS 4.2 will continue to be back-ported to GPFS in a timely manner. ANL supports SPXXL 2016 requirement (Ticket 21). 6
7 Compilers. IBM XL Compilers for Blue Gene/Q. XL Fortran 14.1: current BG/Q version August 2015; AIX and Linux last updated April/May 2016. XL C/C++: current BG/Q version August 2015; AIX and Linux last updated April 2016. Hoping for a BG/Q update. 7
8 PMRs. PMR: getdents/getdents64 alias conflict with toolchain [Apr 2016]. Success - fixed in V1R2M4. PMR: no way to list the contents of a node's /dev/shmem or /dev/persistent from a CNK program (can only access from off-node using CDTI). Negotiating with IBM on a solution (stalled on ANL). 8
9 Implementing GPFS AFM as a Burst Buffer 9
10 ALCF Operational File System Configuration. There are three main production-level GPFS file systems: mira-fs0, 19 PiB; mira-fs1, 7 PiB; mira-home, 1.1 PiB. These are based on DDN 12K-E (embedded NSD servers) storage. The file systems are mounted on all the BG I/O servers, a Cray visualization cluster, Globus DTN nodes, and HPSS data movers. Symlinks to the project filesets are used to mask the underlying file system from the users; /project contains links to mira-fs0 and mira-fs1 filesets. 10
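A minimal sketch of the masking scheme described above, with hypothetical mount points and project names; each entry under /project is just a symlink into a fileset junction, so users never need to know which file system actually holds their project:

    # /project/MyProjA lives on mira-fs0, /project/MyProjB on mira-fs1,
    # but users reach both through the same /project prefix.
    ln -s /gpfs/mira-fs0/MyProjA /project/MyProjA
    ln -s /gpfs/mira-fs1/MyProjB /project/MyProjB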
11 The big picture. [Diagram, three stages.] Current: fs0 (16 DDN, 21 PB, 240 GB/s) serving projects A,C,D,G,H and fs1 (6 DDN, 7 PB, 90 GB/s) serving B,E,F. Step 1 (AFM): fs2 (30 ESS, 8 PB, 400 GB/s) added in front of fs0/fs1 as a cache for all projects A-H. Step 2 (GHI): HPSS, via GHI, added behind fs0/fs1. 11
12 ESS 50,000-foot view. Terminology: AFM = Active File Management. Cache: fs2, the non-permanent location. Home: fs0 and fs1, the permanent location (not /home, which is confusing). fs2 is a cache; like any other cache, files can be evicted, and AFM ensures there is a good copy in home before eviction. Eviction is policy driven; our policies will probably be fairly simple (high water / low water and LRU), but almost any file system parameter can be checked. AFM is constantly syncing files between the cache and home; the default is every 15 seconds, but this is alterable. It basically replays events from the cache on home (metadata updates, block changes, etc.). We can influence the priority of new writes vs. replication with tuning parameters, but true QoS is not here yet (it is on the roadmap). Fundamental assumption: our file systems are big enough that recalls will be rare. 12
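A hedged illustration of the knobs mentioned above (hypothetical file system and fileset names; attribute and command names as documented for IBM Spectrum Scale AFM, so verify against the release in use). The cache-to-home sync interval is the fileset's afmAsyncDelay attribute, and eviction can also be driven manually:

    # Raise the replay delay from the 15 s default to 30 s for one cache fileset.
    mmchfileset fs2 projA -p afmAsyncDelay=30

    # Evict cached blocks for the fileset; AFM only frees data that already
    # has a good copy at home.
    mmafmctl fs2 evict -j projA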
13 Why? Overall, a better experience for both the users and the admins. No single point of failure: we got this by adding fs1 alongside the existing fs0 file system. All users see the same performance: bandwidth is not uniform between fs0 and fs1, but fs2 and AFM give it back. A cache miss might cause run-to-run variation, but all users are equally subject to this, and given our cache size the assumption is it will be rare. Minimal user/admin intervention required: ideally, policies will do the right thing at the right time. Example: project data will automatically disappear, via GHI, after a project completes; the storage team doesn't need to do anything. It should be trivial (a single command) to cause the right behavior, probably based on an extended attribute and policy. Minimal (ideally zero) manual copying of files; all handled by the file system. 13
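The "single command based on an extended attribute and policy" idea might look like the following sketch (hypothetical attribute name and path; mmchattr --set-attr is the Spectrum Scale mechanism for setting a GPFS extended attribute). A policy rule keyed on that attribute, like the one in the GHI sketch later in this deck, would do the rest:

    # Hypothetical attribute: mark a project as expired so policy can act on it.
    mmchattr --set-attr user.project.state=expired /project/MyProjA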
14 Implementation Overview. Current state: internal testing. 2 ESS node pairs were deleted from the ESS and used to form a test home cluster, with a file system called fm-fs1. Using the AFM mode that allows for a GPFS home file system. mira-fs0 and mira-fs1 will be remotely mounted to the ESS system; mira-home will not be AFM managed. The home file system is defined as gpfs:/// with a null server list, which signifies it is remotely mounted to the cache cluster. 14
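A sketch of creating such a cache fileset (all names hypothetical, and the AFM mode shown is illustrative since the slide does not state which mode was chosen); the gpfs:/// target with an empty server list is what marks home as a remotely mounted GPFS file system rather than an NFS export:

    # Create an AFM cache fileset on fs2 backed by a remote-mounted GPFS home.
    mmcrfileset fs2 projA --inode-space new \
        -p afmMode=independent-writer,afmTarget=gpfs:///gpfs/fm-fs1/projA
    # Make the fileset visible in the cache namespace.
    mmlinkfileset fs2 projA -J /gpfs/fs2/projA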
15 Implementation Challenges. This cluster was originally based on the GSS product, which was xSeries (Intel) based; the cluster was migrated from xSeries to pSeries (ESS) to address support concerns. Kernel panics in the mpt2sas kernel module. Red Hat bug: (bug) / (back port to 7.1 zstream). Difficult PD/recreate path; much of the work was done by ALCF dealing directly with Red Hat and Avago, with up to 3 weeks between recreates. Fixed in the RHEL 7.2 stream, but since ESS won't support RHEL 7.2 in calendar year 2016, we needed to ask Red Hat for a port to the RHEL 7.1 zstream. This fix is under final tests now at Red Hat; as soon as it is available, it will be incorporated in our ESS cluster. Fixed in: kernel el7. 15
16 Implementation Challenges. Quotas: there is no communication between home and cache with respect to quota settings, data is synced to home as root, and root does not error on over-quota. We found it best to turn off auto-migration (eviction) on cache filesets, which prevents over-running home's hard limits; see the sketch below. Bug found: after a file was evicted and subsequently re-read from the cache cluster, it was not becoming resident again in cache. An efix is under construction. 16
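A sketch of the workaround (hypothetical names; afmEnableAutoEviction is the Spectrum Scale attribute for per-fileset automatic eviction, assuming it is available in the release in use):

    # Disable automatic eviction on the cache fileset. Because AFM must flush
    # data home before evicting it, uncontrolled eviction can push a fileset
    # past home's hard limits; with this off, eviction runs only under
    # explicit admin or policy control.
    mmchfileset fs2 projA -p afmEnableAutoEviction=no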
17 GHI deployment 17
18 File System Backup Overview. Mira-home is intended to store executable files and configuration files. User quota limits are enforced. Data is fully protected through GPFS metadata and data replication, GPFS snapshots, and nightly backups. Mira-fs0 and mira-fs1 are intended as intermediate-term storage for Mira/Cetus job output such as checkpoint datasets. Project-associated fileset quota limits are enforced. Only metadata replication is enabled. The user is responsible for archiving their own data. After project expiration, quota limits are reduced, data is archived, and the fileset is removed. 18
19 Current archiving (fs0, fs1). Users manually archive files via HSI or Globus/GridFTP. When a user project expires, a script invokes HSI: copy to tape, shrink the disk quota. 90 days after expiration: unmount (but data is still on disk). 180 days: delete from disk; a copy is retained on tape. No other system-related archiving. 19
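A hedged sketch of the expiration script's HSI step (hypothetical paths, login, and keytab; flags and syntax as in HSI documentation, so verify locally). HSI's cput is a conditional put, copying only files that do not already have an up-to-date archived copy:

    # Archive the expired project tree to HPSS; the script then shrinks quota.
    hsi -A keytab -k /etc/hpss/archive.keytab -l archiveuser \
        "mkdir -p /archive/MyProjA ; cput -R /gpfs/mira-fs0/MyProjA : /archive/MyProjA"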
20 GHI value added. Instead of scripting archiving directly, generate GPFS policy files to tell GHI when/how to migrate expired projects. Improved disaster recovery: migrate everything (leave it in place on disk). Optional: implement a threshold policy, e.g. when the fs reaches 90% full, start migration down to 80% (punch holes in files, leaving the metadata). GHI image backup: create an image of the entire fs containing only the "punched out" files (metadata); on recovery, restore the entire fs quickly at first, then retrieve the remaining contents on demand until caught up. 20
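A sketch of what those generated policy files might contain (pool names and the GHI interface-script path are hypothetical; THRESHOLD, WEIGHT, and XATTR are standard GPFS policy-language constructs):

    /* External pool that hands files to GHI/HPSS; script path is hypothetical. */
    RULE EXTERNAL POOL 'hsm' EXEC '/opt/hpss/bin/ghi_migrate'

    /* Threshold policy: at 90% full, migrate the coldest files until occupancy
       falls to 80%, punching holes in the disk copies and keeping metadata. */
    RULE 'spill' MIGRATE FROM POOL 'system' THRESHOLD(90,80)
         WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME) TO POOL 'hsm'

    /* Expired projects: migrate everything tagged via extended attribute. */
    RULE 'expired' MIGRATE FROM POOL 'system' TO POOL 'hsm'
         WHERE XATTR('user.project.state') = 'expired'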
21 GHI Deployment Status. In progress (~6 months). GHI installed and enabled on a test fs. Some delays due to a PMR (snapshots not deleted; deletion times out due to other processes); other sites report the issue resolved. Expecting deployment this year. 21
22 High Level GHI Integration View. [Diagram: a GPFS cluster (GPFS NSD nodes, GHI session node, GHI IOM nodes, DDN storage) connected to HPSS (data movers, disk cache, tape); interconnects shown are 1GbE, 10GbE, QDR IB, FC, and 4 x 6Gb/s SAS.]
23 Portability/Interoperability 23
24 Coordinating SC Centers' Efforts. Meetings of cross-lab applications readiness staff. Tools and libraries working group (W. Joubert). Cross-lab training committee (F. Foerttner): shared calendar, shared training events. Managing nondisclosure and export-control challenges: CORAL partners, APEX partners (NERSC & Trinity). Portability meetings: March 2014, apps readiness coordination, ~15 representatives; September 2014, apps portability and apps readiness coordination, ~25 representatives; January 2015, next-gen hardware & software and portable programming, ~40 center staff and ~10 vendor reps. 24
25 OpenMP 4.5 Source Code Portability.

Mapping array sections to and from the device:

    double x[128], y[128];
    #pragma omp target data map(to:x[0:64]) map(tofrom:y[0:64])
    {
        #pragma omp target
        {
            // y computed on device
        }
    }

SIMD as standard, portable vectorization:

    double x[128], y[128];
    #pragma omp for simd aligned(x, y: 32)
    for (int i = 0; i < 128; i++) {
        // thread iterations map to SIMD lanes
    }

SAXPY offloaded across teams of threads:

    #pragma omp target data map(to:x)
    {
        #pragma omp target map(tofrom:y)
        #pragma omp teams
        #pragma omp distribute
        #pragma omp parallel for
        for (int i = 0; i < n; ++i) {
            y[i] = a*x[i] + y[i];
        }
    }

One source serves host (CPU) and device (accelerator: GPGPU, Intel MIC); omp_get_num_devices() queries the available targets; map may resolve to a shared location; no more #ifdef MANYCORE / #else / #endif around <code>. 25
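For reference, a self-contained, compilable version of the SAXPY fragment above (plain C plus OpenMP 4.x; with no device present, the target region falls back to the host):

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        enum { N = 128 };
        double a = 2.0, x[N], y[N];
        for (int i = 0; i < N; ++i) { x[i] = i; y[i] = 1.0; }

        /* The same source works with zero devices (host fallback) or many. */
        printf("devices available: %d\n", omp_get_num_devices());

        #pragma omp target data map(to: x) map(tofrom: y)
        {
            #pragma omp target teams distribute parallel for
            for (int i = 0; i < N; ++i)
                y[i] = a * x[i] + y[i];
        }

        printf("y[1] = %f (expect 3.0)\n", y[1]);
        return 0;
    }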
26 Performance Portability. Use a directives-based approach where it performs: OpenMP 4.5, OpenMP 4.0. Otherwise, use libraries or frameworks to encapsulate node-level parallelism: app modules, CUDA, vector intrinsics, shmem, Charm++, ADLB, PETSc, Chombo, Trilinos, MADNESS, Kokkos, RAJA. 26
27 Effort Continuing. Coordination between ESP, CAAR, and NESAP. ALCC allocation at ANL, OLCF, and NERSC for portability work. Shared training, including live/videocon. Bringing the NNSA labs into the discussion: HPCOR (9/2015), COEPP (4/2016). SC15 Workshop on Portability Among HPC Architectures for Scientific Applications. New effort on kernels/mini-apps: optimize on 2 platforms, then rework with OpenMP 4.5 or OpenACC and compare. NekBone (ALCF), BoxLib (NERSC), DSL-based library for MD (OLCF). 27
28 Training - ATPESC. Argonne Training Program on Extreme-Scale Computing: an intensive 2-week course held off-site. Audience: doctoral students, postdocs, and computational scientists. Presenters are leaders in all major areas of HPC. Planning is in progress for the 4th session in August. 28
29 The End 29