Using SDSC Systems (part 2)
1 Using SDSC Systems (part 2): Running vSMP jobs, Data Transfer, I/O
SDSC Summer Institute, August
Mahidhar Tatineni, San Diego Supercomputer Center
2 vSMP Runtime Guidelines: Overview
- Identify the type of job: serial (large memory), threaded (pthreads, OpenMP), or MPI.
- The workshop directory has examples for the different scenarios. The hands-on section (today and during the ScaleMP session later) will walk through the different types.
- Use affinity in conjunction with the automatic process placement utility (numabind).
- Optimized MPI (MPICH2 tuned for vSMP) is available.
3 vSMP Guidelines for Threaded Codes
4 Compiling OpenMP Example
Change to the workshop directory:
    cd ~/SI12_basics/GORDON_PART2
Compile using the OpenMP flag:
    ifort -o hello_vsmp -openmp hello_vsmp.f90
Verify that the executable was created:
    ls -lt hello_vsmp
    -rwxr-xr-x 1 train61 gue May 9 10:31 hello_vsmp
5 Hello World on vSMP node (using OpenMP)
Job script hello_vsmp.cmd:
    #!/bin/bash
    #PBS -q vsmp
    #PBS -N hello_vsmp
    #PBS -l nodes=1:ppn=16:vsmp
    #PBS -l walltime=0:10:00
    #PBS -o hello_vsmp.out
    #PBS -e hello_vsmp.err
    #PBS -V
    #PBS -M username@xyz123.edu
    #PBS -m abe
    #PBS -A gue998
    cd ~/SI12_basics/GORDON_PART2
    export LD_PRELOAD=/opt/ScaleMP/libvsmpclib/0.1/lib64/libvsmpclib.so
    export PATH="/opt/ScaleMP/numabind/bin:$PATH"
    export KMP_AFFINITY=compact,verbose,0,`numabind --offset 8`
    export OMP_NUM_THREADS=8
    ./hello_vsmp
6 Hello World on vSMP node (using OpenMP)
Code written using OpenMP:
    PROGRAM OMPHELLO
    INTEGER TNUMBER
    INTEGER OMP_GET_THREAD_NUM
    !$OMP PARALLEL DEFAULT(PRIVATE)
    TNUMBER = OMP_GET_THREAD_NUM()
    PRINT *, 'HELLO FROM THREAD NUMBER = ', TNUMBER
    !$OMP END PARALLEL
    STOP
    END
7 vSMP OpenMP binding info (from hello_vsmp.err file)
    OMP: Info #147: KMP_AFFINITY: Internal thread 0 bound to OS proc set {504}
    OMP: Info #147: KMP_AFFINITY: Internal thread 1 bound to OS proc set {505}
    OMP: Info #147: KMP_AFFINITY: Internal thread 2 bound to OS proc set {506}
    OMP: Info #147: KMP_AFFINITY: Internal thread 3 bound to OS proc set {507}
    OMP: Info #147: KMP_AFFINITY: Internal thread 4 bound to OS proc set {508}
    OMP: Info #147: KMP_AFFINITY: Internal thread 5 bound to OS proc set {509}
    OMP: Info #147: KMP_AFFINITY: Internal thread 7 bound to OS proc set {511}
    OMP: Info #147: KMP_AFFINITY: Internal thread 6 bound to OS proc set {510}
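The binding report can be checked mechanically instead of by eye. The following is a minimal sketch, assuming the verbose KMP_AFFINITY lines have exactly the format shown on slide 7. The function name check_consecutive_binding is hypothetical, and the sample input is copied from the slide rather than read from a real hello_vsmp.err.

```shell
#!/bin/bash
# Hypothetical helper: reads KMP_AFFINITY verbose lines on stdin and reports
# whether the OS procs, ordered by thread number, form a consecutive run.
check_consecutive_binding() {
  sed -n 's/.*Internal thread \([0-9]*\) bound to OS proc set {\([0-9]*\)}.*/\1 \2/p' |
    sort -n |
    LC_ALL=C awk '
      NR > 1 && $2 != prev + 1 { bad = 1 }
      { prev = $2; if (NR == 1) first = $2 }
      END {
        if (NR == 0) { print "no binding lines found"; exit 1 }
        if (bad)     { print "non-consecutive"; exit 1 }
        printf "consecutive: %d threads on procs %d-%d\n", NR, first, prev
      }'
}

# Sample input copied from the slide (threads 7 and 6 are reported out of order).
sample='OMP: Info #147: KMP_AFFINITY: Internal thread 0 bound to OS proc set {504}
OMP: Info #147: KMP_AFFINITY: Internal thread 1 bound to OS proc set {505}
OMP: Info #147: KMP_AFFINITY: Internal thread 2 bound to OS proc set {506}
OMP: Info #147: KMP_AFFINITY: Internal thread 3 bound to OS proc set {507}
OMP: Info #147: KMP_AFFINITY: Internal thread 4 bound to OS proc set {508}
OMP: Info #147: KMP_AFFINITY: Internal thread 5 bound to OS proc set {509}
OMP: Info #147: KMP_AFFINITY: Internal thread 7 bound to OS proc set {511}
OMP: Info #147: KMP_AFFINITY: Internal thread 6 bound to OS proc set {510}'

binding_report=$(printf '%s\n' "$sample" | check_consecutive_binding)
echo "$binding_report"
```

On a real run the same function could be fed the job's error file, for example with check_consecutive_binding < hello_vsmp.err.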
8 Hello World (OpenMP version) Output
    HELLO FROM THREAD NUMBER = 1
    HELLO FROM THREAD NUMBER = 6
    HELLO FROM THREAD NUMBER = 5
    HELLO FROM THREAD NUMBER = 4
    HELLO FROM THREAD NUMBER = 3
    HELLO FROM THREAD NUMBER = 2
    HELLO FROM THREAD NUMBER = 0
    HELLO FROM THREAD NUMBER = 7
    Nodes: gcn-3-11
9 vSMP Pthreads Example
    cd ~/SI12_basics/GORDON_PART2
    # PATH to numabind
    export PATH=/opt/ScaleMP/numabind/bin:$PATH
    # ScaleMP preload library that throttles down unnecessary system calls.
    export LD_PRELOAD=/opt/ScaleMP/libvsmpclib/0.1/lib64/libvsmpclib.so
    # Specify sleep duration for each pthread. Default = 60 sec if not set.
    export SLEEP_TIME=30
    # 16 pthreads will be created.
    NP=16
    log=log-$NP-`date +%s`.txt
    ./ptest $NP >> $log 2>&1 &
    # Wait 15 seconds for all the threads to start.
    sleep 15
    echo "ptest threads affinity before numabind" >> $log 2>&1
    ps -eLo pid,lwp,time,ucmd,psr | grep ptest >> $log 2>&1
    # Start numabind with a config file that has a rule for pthreads,
    # which places all threads on consecutive CPUs.
    numabind --config myconfig >> $log 2>&1
    echo "ptest threads affinity after numabind" >> $log 2>&1
    ps -eLo pid,lwp,time,ucmd,psr | grep ptest >> $log 2>&1
    sleep 300
10 Data Transfer (scp, globus-url-copy)
- scp is fine for simple file transfers and small file sizes (<1GB). Example:
    $ scp w.txt train40@gordon.sdsc.edu:/home/train40/w.txt
    100% 15KB 14.6KB/s 00:00
- globus-url-copy is for large-scale data transfers between XD resources (and local machines with a Globus client).
  - Uses your XSEDE-wide username and password.
  - Retrieves your certificate proxies from the central server.
  - Highest performance between XSEDE sites; uses striping across multiple servers and multiple threads on each server.
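The size rule of thumb on slide 10 can be written down as a small decision helper. This is only a sketch: the function name pick_transfer_tool is hypothetical, and "1GB" is assumed to mean 1024^3 bytes, since the slide does not say which unit is meant.

```shell
#!/bin/bash
# Hypothetical helper: suggest a transfer tool from a file size in bytes.
# Assumption: the slide's "<1GB" threshold is taken as 1024^3 bytes.
ONE_GB=$((1024 * 1024 * 1024))

pick_transfer_tool() {
  local size_bytes=$1
  if [ "$size_bytes" -lt "$ONE_GB" ]; then
    echo "scp"
  else
    echo "globus-url-copy"
  fi
}

small_choice=$(pick_transfer_tool 15000)              # roughly the 15KB w.txt example
large_choice=$(pick_transfer_tool $((20 * ONE_GB)))   # a hypothetical 20 GB archive
echo "15 KB file: $small_choice"
echo "20 GB file: $large_choice"
```

The threshold is a guideline from the slide, not a hard limit of either tool.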
11 Data Transfer: globus-url-copy
Step 1: Retrieve certificate proxies:
    $ module load globus
    $ myproxy-logon -l xsedeusername
    Enter MyProxy pass phrase:
    A credential has been received for user xsedeusername in /tmp/x509up_u
Step 2: Initiate globus-url-copy:
    $ globus-url-copy -vb -stripe -tcp-bs 16m -p 4 gsiftp://gridftp.ranger.tacc.teragrid.org:2811///scratch/00342/username/test.tar gsiftp://trestles-dm2.sdsc.xsede.org:2811///oasis/scratch/username/temp_project/test-gordon.tar
    Source: gsiftp://gridftp.ranger.tacc.teragrid.org:2811///scratch/00342/username/
    Dest: gsiftp://trestles-dm2.sdsc.xsede.org:2811///oasis/scratch/username/temp_project/
    test.tar -> test-gordon.tar
12 Data Transfer: Globus Online
- Works from Windows/Linux/Mac via the Globus Online website.
- Gordon and Trestles endpoints already exist. Authentication can be done using your XSEDE-wide username and password.
- The Globus Connect application (available for Windows/Linux/Mac) can turn your laptop or desktop into an endpoint.
13 Data Transfer: Globus Online
Step 1: Create a Globus Online account.
14 Data Transfer: Globus Online
Step 2: Set up the local machine as an endpoint using Globus Connect.
15 Data Transfer: Globus Online
16 Data Transfer: Globus Online
Step 3: Pick endpoints and initiate transfers.
17 Data Transfer: Globus Online
18 Gordon: Filesystems
- Lustre filesystems: good for scalable large-block I/O. Accessible from both native and vSMP nodes.
  - /oasis/scratch/gordon: 1.6 PB, peak measured performance ~50 GB/s on reads and writes.
  - /oasis/projects: ~400 TB.
- SSD filesystems:
  - /scratch local to each native compute node: 300 GB each.
  - /scratch on vSMP node: 4.8 TB of SSD-based filesystem.
- NFS filesystems (/home).
19 Gordon Network Architecture
[Figure: network diagram. Labels recoverable from the slide:]
- External: XSEDE & R&E networks, SDSC network.
- Front end: data movers (4x), management nodes (2x), login nodes (4x, round-robin login), NFS servers (2x, mirrored NFS); redundant front end.
- Ethernet: management edge & core, public edge & core.
- Cluster: 1,024 compute nodes, 64 I/O nodes, Data Oasis Lustre PFS (4 PB).
- Links: dual-rail IB (QDR 40 Gb/s) as 3D torus rail 1 and rail 2; dual 10GbE storage (2x10GbE); GbE management; GbE public.
20 Gordon 3D Torus Interconnect Fabric: 4x4x4 3D Torus Topology
- 4x4x4 mesh; the ends are folded on all three dimensions to form a 3D torus.
- Dual-rail network: increased bandwidth and redundancy, with a single connection to each network.
- Each switch group: 16 compute nodes (CN) and 2 I/O nodes.
- [Figure: two 36-port fabric switches, each with 18 x 4X IB network connections, labeled 48GB/sec.]
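Because the ends of each dimension are folded back, the hop count between two switches in a 4x4x4 torus is never more than 2 per dimension. A small sketch of that arithmetic follows; the (x y z) switch coordinates in 0..3 are hypothetical, since the slides do not say how switches are numbered, and the function names are made up for illustration.

```shell
#!/bin/bash
# Hop count between two switches in a 4x4x4 torus, given hypothetical
# (x y z) coordinates in 0..3. With wraparound, each dimension
# contributes min(d, 4 - d) hops, where d is the coordinate difference.
TORUS_DIM=4

axis_hops() {
  local d=$(( $1 - $2 ))
  if [ "$d" -lt 0 ]; then
    d=$(( -d ))
  fi
  local wrap=$(( TORUS_DIM - d ))
  if [ "$wrap" -lt "$d" ]; then
    echo "$wrap"
  else
    echo "$d"
  fi
}

# Arguments: x1 y1 z1 x2 y2 z2
torus_hops() {
  echo $(( $(axis_hops "$1" "$4") + $(axis_hops "$2" "$5") + $(axis_hops "$3" "$6") ))
}

same_switch=$(torus_hops 0 0 0 0 0 0)
wrap_neighbor=$(torus_hops 0 0 0 3 0 0)   # 3 steps forward, but only 1 step via the folded end
farthest=$(torus_hops 0 0 0 2 2 2)
echo "same switch: $same_switch hops"
echo "wraparound neighbor: $wrap_neighbor hop"
echo "farthest pair: $farthest hops"
```

Under this model the largest possible distance in a 4x4x4 torus is 6 hops, which gives some context for the 0-hop and 4-hop Lustre examples later in the deck.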
21 Data Oasis Heterogeneous Architecture: Lustre-based Parallel File System
[Figure: architecture diagram. Labels recoverable from the slide:]
- Three distinct network architectures: Trestles IB cluster (Mellanox 5020 bridge, 12 GB/s), Gordon IB cluster (64 Lustre LNET routers, 100 GB/s), Triton Myrinet cluster (Myrinet 10G switch, 25 GB/s).
- Redundant Arista switches for reliability and performance.
- MDS: metadata servers.
- 64 OSS (Object Storage Servers, 72TB each) provide 100GB/s performance and >4PB raw capacity.
- JBODs (Just a Bunch Of Disks, 90TB each) provide capacity scale-out to an additional 5.8PB.
22 Data Oasis from Gordon: It's the Routers!
- Gordon has 64 I/O nodes, which host the flash and also serve as routers for the Lustre filesystems.
- Lustre clients are configured to use the local I/O node if available. This maximizes the overall write performance on the system.
- Reads are distributed round-robin over the available routers.
- Workshop examples illustrate the locality of the write operations.
23 Lustre Examples
Two example scripts in the ~/SI12_basics/GORDON_PART2 directory:
- IOR_lustre_0_hops.cmd: runs jobs with all nodes on one switch.
- IOR_lustre_4_hops.cmd: runs jobs with nodes up to 4 hops away.
Example output:
- ior_maxhops0.out: all nodes were on the same switch and hence used only *one* router. Max Write: MB/s.
- ior_maxhops4.out: the nodes ended up on two switches and hence two routers were in play during the write. Max Write: MB/s.
24 Data Oasis Performance
25 Model A: One SSD per Compute Node
[Figure: logical view with compute nodes attached to Lustre; only 4 of 16 compute nodes shown.]
- One 300 GB flash drive exported to each compute node appears as a local file system.
- The Lustre parallel file system is mounted identically on all nodes.
- File system appears as: /scratch/$USER/$PBS_JOBID
- Use cases: applications that need local, temporary scratch; Gaussian; Abaqus; Hadoop.
26 Using SSD Scratch (Native Nodes)
    #!/bin/bash
    #PBS -q normal
    #PBS -N ior_native
    #PBS -l nodes=1:ppn=16:native
    #PBS -l walltime=00:25:00
    #PBS -o ior_scratch_native.out
    #PBS -e ior_scratch_native.err
    #PBS -V
    #PBS -M username@xyz123.edu
    #PBS -m abe
    #PBS -A gue998

    cd /scratch/$USER/$PBS_JOBID

    mpirun_rsh -hostfile $PBS_NODEFILE -np 4 ~/SI12_basics/GORDON_PART2/IOR-gordon -i 1 -F -b 16g -t 1m -v -v > IOR_native_scratch.log

    cp /scratch/$USER/$PBS_JOBID/IOR_native_scratch.log ~/SI12_basics/GORDON_PART2/
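The job script on slide 26 follows a common pattern: change into the per-job SSD scratch directory, run there, and copy results back before the job ends. A minimal sketch of that pattern is given below. The temporary-directory fallbacks and the placeholder workload are assumptions made only so the sketch can run outside a batch job; they are not part of the workshop scripts.

```shell
#!/bin/bash
# Sketch of the stage-to-scratch pattern. Outside a batch job PBS_JOBID is
# unset, so this falls back to a temporary directory (an assumption made
# only so the sketch runs anywhere).
if [ -n "$PBS_JOBID" ]; then
  scratch_dir="/scratch/$USER/$PBS_JOBID"
else
  scratch_dir=$(mktemp -d)
fi

# Hypothetical stand-in for the directory the results are copied back to
# (~/SI12_basics/GORDON_PART2 in the workshop script).
results_dir=$(mktemp -d)

cd "$scratch_dir" || exit 1

# Placeholder for the real workload (the IOR run in the job script).
echo "placeholder output" > run.log

# Copy results off the SSD before the job ends.
cp "$scratch_dir/run.log" "$results_dir/"
copied_log="$results_dir/run.log"
echo "results saved to $copied_log"
```

The copy-back step matters because, as the deck notes elsewhere, flash scratch data is generally removed at the end of a run.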
27 Using SSD Scratch (Native Nodes)
Snapshot on the node during the run:
    $ pwd
    /scratch/mahidhar/72251.gordon-fe2.local
    $ ls -lt
    total
    -rw-r--r-- 1 mahidhar hpss May 15 23:48 testfile
    -rw-r--r-- 1 mahidhar hpss May 15 23:48 testfile
    -rw-r--r-- 1 mahidhar hpss May 15 23:48 testfile
    -rw-r--r-- 1 mahidhar hpss May 15 23:48 testfile
    -rw-r--r-- 1 mahidhar hpss 1101 May 15 23:48 IOR_native_scratch.log
Performance from a single node (in the log file copied back):
    Max Write: MiB/sec ( MB/sec)
    Max Read: MiB/sec ( MB/sec)
28 IOPS: SSD vs Lustre
The FIO benchmark is used to measure random I/O performance.
Sample scripts:
- scratch_native_fio.cmd (uses SSDs).
- lustre_native_fio.cmd. Note: we will not run this today! It will overload the metadata server if there are too many simultaneous jobs with lots of random I/O requests. Output from a test run is in ior_lustre_native_fio.out to illustrate the low IOPS.
Sample performance numbers:
- SSD: random write iops=4782, random read iops=13738
- Lustre: random write iops=671, random read iops=101
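To put the sample numbers from slide 28 side by side, the ratio of SSD to Lustre IOPS can be computed directly. This is a small sketch using only the four values quoted on the slide; the function name iops_ratio is hypothetical.

```shell
#!/bin/bash
# Ratio of SSD to Lustre IOPS, using the sample numbers quoted on the slide.
ssd_write=4782
ssd_read=13738
lustre_write=671
lustre_read=101

# Hypothetical helper: prints a / b rounded to one decimal place.
iops_ratio() {
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f", a / b }'
}

write_ratio=$(iops_ratio "$ssd_write" "$lustre_write")
read_ratio=$(iops_ratio "$ssd_read" "$lustre_read")
echo "random write: SSD is ${write_ratio}x Lustre"
echo "random read:  SSD is ${read_ratio}x Lustre"
```

For this one test run the gap is about 7x for random writes and about 136x for random reads, so random-read-heavy workloads have the most to gain from the SSDs.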
29 Which I/O system is right for my application?
Flash-based I/O nodes
- Performance: SSDs support low-latency I/O, high IOPS, and high bandwidth. One SSD can deliver 37K IOPS. Flash resources are dedicated to the user, and performance is largely independent of what other users are doing on the system.
- Infrastructure: SSDs are deployed in I/O nodes using iSER, an RDMA protocol that is accessed over the InfiniBand network.
- Persistence: Data is generally removed at the end of a run so the resource can be made available to the next job.
- Capacity: Up to 4.8 TB per user, depending on configuration.
- Use cases: Local application scratch (Abaqus, Gaussian); data mining platform (e.g., Hadoop); graph problems.
Lustre
- Performance: Lustre is ubiquitous in HPC. It does well for sequential I/O and for I/O to a few files from many cores simultaneously. Random I/O is a Lustre killer. Lustre is a shared resource, and performance will vary depending on what other users are doing.
- Infrastructure: 64 OSSs; distinct file systems and metadata servers; accessed over a 10GbE network via the I/O nodes. Hundreds of HDDs/spindles.
- Persistence: Most is deployed as scratch and is purgeable by policy (not necessarily at the end of the job). Some is deployed as a persistent project storage resource.
- Capacity: No specific limits or quotas imposed on scratch. The file system is ~2 PB.
- Use cases: Traditional HPC I/O associated with MPI applications. Prestaging of data that will be pulled into flash.
30 Model B: 16 SSDs for 1 Compute Node
[Figure: logical view of one compute node with a 4.8 TB file system, attached to Lustre.]
- 16 SSDs in a RAID0 appear as a single 4.8 TB file system to the compute node.
- Flash I/O and Lustre traffic use rail 1 of the torus.
- File system appears as: /scratch/$USER/$PBS_JOBID
- Use cases: database, data mining, Gaussian.
31 Model B: 16 SSDs for 1 Compute Node
We have 4 nodes in rack 18 set up under this model: gcn-18-11, gcn-18-31, gcn-18-51, and gcn
We have reserved nodes gcn-18-51, gcn for Summer Institute users who wish to use this model. Users can directly request the nodes (example below):
    #!/bin/bash
    #PBS -q normal
    #PBS -N ior_native
    #PBS -l nodes=gcn-18-51:ppn=16:native
    #PBS -l walltime=00:25:00
    #PBS -o ior_scratch_native.out
    #PBS -e ior_scratch_native.err
    #PBS -V
    #PBS -M mahidhar@sdsc.edu
    #PBS -m abe
    #PBS -A use300
    cd /scratch/$USER/$PBS_JOBID
32 Model C: 16 SSDs within a vSMP Supernode
[Figure: logical view of a 16-node virtual compute image (1 TB) with a 4.8 TB file system; Lustre is not part of the supernode.]
- 4.8 TB of flash as a single XFS file system.
- Flash I/O uses both rail 0 and rail 1.
- File system appears as: /scratch1/$USER/$PBS_JOBID (/scratch2 is available if using a 32-node supernode).
- Use cases: serial and threaded applications that need large memory and local disk; Abaqus; genomics (Velvet, Allpaths, etc.).
33 Model C: 16 SSDs within a vSMP Supernode
We have reserved vSMP nodes for Summer Institute users who wish to use this model. Users can directly request the nodes (example below):
    #!/bin/bash
    #PBS -q vsmp
    #PBS -N ior_vsmp
    #PBS -l nodes=1:ppn=16:vsmp
    #PBS -l walltime=00:25:00
    #PBS -o ior_scratch_vsmp.out
    #PBS -e ior_scratch_vsmp.err
    #PBS -V
    #PBS -M username@xyz123.edu
    #PBS -m abe
    #PBS -A gue998

    cd /scratch1/$USER/$PBS_JOBID
34 Summary, Q/A
- Follow the guidelines for serial, OpenMP, Pthreads, and MPI jobs on the vSMP nodes.
- Access options: ssh clients, XSEDE User Portal.
- Data transfer options: scp, globus-url-copy (GridFTP), Globus Online, and the XSEDE User Portal File Manager.
- Lustre is routed over I/O nodes. Write performance is determined by the number of routers used by a job.
- Use SSD local scratch where possible. Excellent for codes like Gaussian and Abaqus.
Performance Optimizations via Connect-IB and Dynamically Connected Transport Service for Maximum Performance on LS-DYNA Pak Lui, Gilad Shainer, Brian Klaff Mellanox Technologies Abstract From concept to
More informationData Staging: Moving large amounts of data around, and moving it close to compute resources
Data Staging: Moving large amounts of data around, and moving it close to compute resources Digital Preserva-on Advanced Prac--oner Course Glasgow, July 19 th 2013 c.cacciari@cineca.it Definition Starting
More informationNew Storage Architectures
New Storage Architectures OpenFabrics Software User Group Workshop Replacing LNET routers with IB routers #OFSUserGroup Lustre Basics Lustre is a clustered file-system for supercomputing Architecture consists
More informationHPC Input/Output. I/O and Darshan. Cristian Simarro User Support Section
HPC Input/Output I/O and Darshan Cristian Simarro Cristian.Simarro@ecmwf.int User Support Section Index Lustre summary HPC I/O Different I/O methods Darshan Introduction Goals Considerations How to use
More informationHPC Storage Use Cases & Future Trends
Oct, 2014 HPC Storage Use Cases & Future Trends Massively-Scalable Platforms and Solutions Engineered for the Big Data and Cloud Era Atul Vidwansa Email: atul@ DDN About Us DDN is a Leader in Massively
More informationOur new HPC-Cluster An overview
Our new HPC-Cluster An overview Christian Hagen Universität Regensburg Regensburg, 15.05.2009 Outline 1 Layout 2 Hardware 3 Software 4 Getting an account 5 Compiling 6 Queueing system 7 Parallelization
More informationMVAPICH MPI and Open MPI
CHAPTER 6 The following sections appear in this chapter: Introduction, page 6-1 Initial Setup, page 6-2 Configure SSH, page 6-2 Edit Environment Variables, page 6-5 Perform MPI Bandwidth Test, page 6-8
More informationNCAR Globally Accessible Data Environment (GLADE) Updated: 15 Feb 2017
NCAR Globally Accessible Data Environment (GLADE) Updated: 15 Feb 2017 Overview The Globally Accessible Data Environment (GLADE) provides centralized file storage for HPC computational, data-analysis,
More informationLustre at Scale The LLNL Way
Lustre at Scale The LLNL Way D. Marc Stearman Lustre Administration Lead Livermore uting - LLNL This work performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory
More informationCrossing the Chasm: Sneaking a parallel file system into Hadoop
Crossing the Chasm: Sneaking a parallel file system into Hadoop Wittawat Tantisiriroj Swapnil Patil, Garth Gibson PARALLEL DATA LABORATORY Carnegie Mellon University In this work Compare and contrast large
More informationFhGFS - Performance at the maximum
FhGFS - Performance at the maximum http://www.fhgfs.com January 22, 2013 Contents 1. Introduction 2 2. Environment 2 3. Benchmark specifications and results 3 3.1. Multi-stream throughput................................
More informationArchitecting Storage for Semiconductor Design: Manufacturing Preparation
White Paper Architecting Storage for Semiconductor Design: Manufacturing Preparation March 2012 WP-7157 EXECUTIVE SUMMARY The manufacturing preparation phase of semiconductor design especially mask data
More informationSMB Direct Update. Tom Talpey and Greg Kramer Microsoft Storage Developer Conference. Microsoft Corporation. All Rights Reserved.
SMB Direct Update Tom Talpey and Greg Kramer Microsoft 1 Outline Part I Ecosystem status and updates SMB 3.02 status SMB Direct applications RDMA protocols and networks Part II SMB Direct details Protocol
More informationGPFS on a Cray XT. Shane Canon Data Systems Group Leader Lawrence Berkeley National Laboratory CUG 2009 Atlanta, GA May 4, 2009
GPFS on a Cray XT Shane Canon Data Systems Group Leader Lawrence Berkeley National Laboratory CUG 2009 Atlanta, GA May 4, 2009 Outline NERSC Global File System GPFS Overview Comparison of Lustre and GPFS
More informationLCE: Lustre at CEA. Stéphane Thiell CEA/DAM
LCE: Lustre at CEA Stéphane Thiell CEA/DAM (stephane.thiell@cea.fr) 1 Lustre at CEA: Outline Lustre at CEA updates (2009) Open Computing Center (CCRT) updates CARRIOCAS (Lustre over WAN) project 2009-2010
More informationHPC Architectures. Types of resource currently in use
HPC Architectures Types of resource currently in use Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us
More informationIME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning
IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning September 22 nd 2015 Tommaso Cecchi 2 What is IME? This breakthrough, software defined storage application
More informationBenefits of 25, 40, and 50GbE Networks for Ceph and Hyper- Converged Infrastructure John F. Kim Mellanox Technologies
Benefits of 25, 40, and 50GbE Networks for Ceph and Hyper- Converged Infrastructure John F. Kim Mellanox Technologies Storage Transitions Change Network Needs Software Defined Storage Flash Storage Storage
More informationHPE Scalable Storage with Intel Enterprise Edition for Lustre*
HPE Scalable Storage with Intel Enterprise Edition for Lustre* HPE Scalable Storage with Intel Enterprise Edition For Lustre* High Performance Storage Solution Meets Demanding I/O requirements Performance
More informationSTAR-CCM+ Performance Benchmark and Profiling. July 2014
STAR-CCM+ Performance Benchmark and Profiling July 2014 Note The following research was performed under the HPC Advisory Council activities Participating vendors: CD-adapco, Intel, Dell, Mellanox Compute
More informationNext-Generation NVMe-Native Parallel Filesystem for Accelerating HPC Workloads
Next-Generation NVMe-Native Parallel Filesystem for Accelerating HPC Workloads Liran Zvibel CEO, Co-founder WekaIO @liranzvibel 1 WekaIO Matrix: Full-featured and Flexible Public or Private S3 Compatible
More informationTo Infiniband or Not Infiniband, One Site s s Perspective. Steve Woods MCNC
To Infiniband or Not Infiniband, One Site s s Perspective Steve Woods MCNC 1 Agenda Infiniband background Current configuration Base Performance Application performance experience Future Conclusions 2
More informationDELL Terascala HPC Storage Solution (DT-HSS2)
DELL Terascala HPC Storage Solution (DT-HSS2) A Dell Technical White Paper Dell Li Ou, Scott Collier Terascala Rick Friedman Dell HPC Solutions Engineering THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES
More informationAdvanced Software for the Supercomputer PRIMEHPC FX10. Copyright 2011 FUJITSU LIMITED
Advanced Software for the Supercomputer PRIMEHPC FX10 System Configuration of PRIMEHPC FX10 nodes Login Compilation Job submission 6D mesh/torus Interconnect Local file system (Temporary area occupied
More informationIntroducing Panasas ActiveStor 14
Introducing Panasas ActiveStor 14 SUPERIOR PERFORMANCE FOR MIXED FILE SIZE ENVIRONMENTS DEREK BURKE, PANASAS EUROPE INTRODUCTION TO PANASAS Storage that accelerates the world s highest performance and
More informationGraham vs legacy systems
New User Seminar Graham vs legacy systems This webinar only covers topics pertaining to graham. For the introduction to our legacy systems (Orca etc.), please check the following recorded webinar: SHARCNet
More informationThe cluster system. Introduction 22th February Jan Saalbach Scientific Computing Group
The cluster system Introduction 22th February 2018 Jan Saalbach Scientific Computing Group cluster-help@luis.uni-hannover.de Contents 1 General information about the compute cluster 2 Available computing
More informationDDN s Vision for the Future of Lustre LUG2015 Robert Triendl
DDN s Vision for the Future of Lustre LUG2015 Robert Triendl 3 Topics 1. The Changing Markets for Lustre 2. A Vision for Lustre that isn t Exascale 3. Building Lustre for the Future 4. Peak vs. Operational
More informationMAHA. - Supercomputing System for Bioinformatics
MAHA - Supercomputing System for Bioinformatics - 2013.01.29 Outline 1. MAHA HW 2. MAHA SW 3. MAHA Storage System 2 ETRI HPC R&D Area - Overview Research area Computing HW MAHA System HW - Rpeak : 0.3
More information