Guillimin HPC Users Meeting February 11, McGill University / Calcul Québec / Compute Canada Montréal, QC Canada

Size: px
Start display at page:

Download "Guillimin HPC Users Meeting February 11, McGill University / Calcul Québec / Compute Canada Montréal, QC Canada"

Transcription

1 Guillimin HPC Users Meeting February 11, 2016 McGill University / Calcul Québec / Compute Canada Montréal, QC Canada

2 Compute Canada News Scheduler Updates Software Updates Training News Special Topic How cold is your data? Cold data and Storage Management on Guillimin Outline 2

3 Compute Canada News RAC and RPP 2016 Competition Results Implementation of new allocation settings for compute and storage has been set in January New: most 2015 CPU allocations are updated with the same RAPI (xyz-123-ab) for 2016 Storage quotas for very large allocations are increased progressively We have already started to decrease some quotas for groups that have a reduced allocation from 2015 to 2016, or have no specific storage allocation for

4 System Status IB network and GPFS issues January 28: InfiniBand hardware issue in one switch Cascade of GPFS communication issues Lots of failed jobs; we temporarily paused the scheduler February 4: InfiniBand network structure (fabric) issue GPFS partitions lost on all nodes Needed a complete restart of IB network and GPFS: we temporarily pause the scheduler, drain compute nodes and drain all login nodes February 9: all VMs and license servers back online 4

5 Scheduler Update The scheduler is mostly stable. Only occasionally it hangs or crashes. Monitoring scripts are in place to restart the server in case of unresponsiveness: please wait 10 minutes if qsub, qstat, etc., fail. Other monitoring scripts automatically take nodes offline in case of issues, including every time a new job is about to start. If the node has issues (GPFS, disk full, RAM, etc.) the job will run elsewhere. 5

6 Software Update New Installations PGI/15.10 (via Lmod only) GAMESS-US/ R1 (via Lmod only) New new Lmod/EasyBuild based module structure: Now backwards compatible, opt-in by doing: touch ~/.lmod_legacy Default on March 15, opt-out via ~/.lmod_disabled Old modulefiles keep working, including those in $HOME/modulefiles Most new modulefiles accessed via: module load iomkl/2015b (loads GCC Intel OpenMPI MKL); see 6

7 Training News See Training and Outreach at for our calendar of training and workshops for 2016 and to register All materials from previous workshops are available online. See also: Suggestions for training? Please let us know! Upcoming: calculquebec.eventbrite.ca February 15 - Introduction à Linux et au CIP (U. Montréal) February 17 - Introduction to Python (McGill U.) February 22 - Introduction à MPI (U. Montréal) Recently Completed: February 3 - Introduction to Linux (McGill U.) January 28 - Introduction to ARC (McGill U.) 7

8 User Feedback and Discussion Questions? Comments? We value your feedback. Contact us at: Guillimin Operational News for Users Status Pages (all CQ systems) Follow us on Twitter 8

9 How cold is your data? Cold data and Storage Management on Guillimin February 11, 2016 McGill University / Calcul Québec / Compute Canada Montréal, QC Canada

10 How cold is your data? Outline: What is cold data? Why is this an issue? How can we deal with it? 10

11 What is cold data? Cold data: any file not accessed for a long time. Access: any read or write operation On Guillimin: We scan the project spaces on all file systems to identify the time of last access and size for all files Unfortunately, no, we cannot cool Guillimin with cold data (but that would be great!, especially in summer) So, yes, we have to cool disks containing cold data But, no, cooling disks is not causing cold data ;-) 11

12 Distribution of cold data TB 12

13 Distribution of cold data TB 13

14 Cold Data Problem - Solution Why is this an issue? RAC requests far exceed capacity /sb: Capacity = 456 TB Allocations + home dir. = 546 TB (120%) /gs: Capacity = 3097 TB Allocations on /gs + scratch dir. = 4207 TB (135%) Not all data has the same temperature! How can we deal with it? Move files between different tiers (classes) of storage so as to provide disk space for active data 14

15 HSM Hierarchical Storage Management What it does: While keeping a single view of the file system (GPFS), it will physically move data across multiple tiers: SSDs (not on Guillimin) Disks (/sb, /gs) Tapes Stub Files (depending on implemented policies): First 1MB of data is kept on first tier (disk) The remaining data is moved to the next tier (tape) Metadata remain accessible from the first tier (disk) 15

16 What does it mean? From an administration and operations perspective project spaces are scanned to identify the size and time of last access for all files Initial selection criteria: Access time and modification time older than 1 year Only files that are larger than 10MB Selected data blocks are moved to the tape system From {/gs, /sb} to a hard-drive buffer, then to tape From a user s perspective File read and write access is blocking The terminal or any job process hangs temporarily By that time: the data recall is queued, the robot puts the tape in a tape-drive, the drive seeks to the data, and the data is moved back to disk 16

17 Tests that have been done First test: simultaneously recall five 4GB files a) First read action (diff) has waited 2 to 3 minutes b) Reading all five files (start time) with Python code: File 1: delay 0s (because already on GPFS from test a)) File 2: delay 42s File 3: delay 52s File 4: delay 15s (sequence: files 1, 4, 2, 3, 5) File 5: delay 60s 17

18 Tests that have been done Second test: simultaneously recall eight 4GB files With Python code (test was started at second 0): File 6: from second 122 to 128 File 1: from second 155 to 161 File 2: from second 166 to 172 File 8: from second 185 to 191 File 7: from second 205 to 211 File 5: from second 219 to 225 File 4: from second 241 to 247 File 3: from second 262 to 268 Good news: no timeout error message! 18

19 Tests that have been done Third test: simultaneously recall eight 4GB files With C code (test was started at second 0): File 1: second 31 to 33 File 7: second 44 to 45 File 3: second 65 to 67 File 5: second 73 to 74 File 8: second 83 to 86 File 6: second 91 to 92 File 4: second 114 to 115 File 2: second 132 to 134 C vs Python: when all files are on disk, C code reads them in 17 seconds, and Python code, in 18 seconds 19

20 Current status of deployment Status: All project spaces have been scanned for candidates A few selected project spaces have been partly migrated Still fine tuning and testing further the migration system New prquota in development Will report data usage both on disk and on tape The total permitted usage will be equal to the allocation size The disk quota will be set dynamically according to the amount of data on tape 20

21 Conclusion For any other questions: 21

Guillimin HPC Users Meeting June 16, 2016

Guillimin HPC Users Meeting June 16, 2016 Guillimin HPC Users Meeting June 16, 2016 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Compute Canada News System Status Software Updates Training News

More information

Guillimin HPC Users Meeting March 17, 2016

Guillimin HPC Users Meeting March 17, 2016 Guillimin HPC Users Meeting March 17, 2016 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Outline Compute Canada News System Status Software Updates Training

More information

Guillimin HPC Users Meeting November 16, 2017

Guillimin HPC Users Meeting November 16, 2017 Guillimin HPC Users Meeting November 16, 2017 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Please be kind to your fellow user meeting attendees Limit

More information

Guillimin HPC Users Meeting. Bryan Caron

Guillimin HPC Users Meeting. Bryan Caron July 17, 2014 Bryan Caron bryan.caron@mcgill.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Outline Compute Canada News Upcoming Maintenance Downtime in August Storage System

More information

Guillimin HPC Users Meeting July 14, 2016

Guillimin HPC Users Meeting July 14, 2016 Guillimin HPC Users Meeting July 14, 2016 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Outline Compute Canada News System Status Software Updates Training

More information

Guillimin HPC Users Meeting. Bart Oldeman

Guillimin HPC Users Meeting. Bart Oldeman June 19, 2014 Bart Oldeman bart.oldeman@mcgill.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Outline Compute Canada News Upcoming Maintenance Downtime in August Storage System

More information

Guillimin HPC Users Meeting October 20, 2016

Guillimin HPC Users Meeting October 20, 2016 Guillimin HPC Users Meeting October 20, 2016 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Please be kind to your fellow user meeting attendees Limit

More information

Guillimin HPC Users Meeting December 14, 2017

Guillimin HPC Users Meeting December 14, 2017 Guillimin HPC Users Meeting December 14, 2017 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Please be kind to your fellow user meeting attendees Limit

More information

Guillimin HPC Users Meeting January 13, 2017

Guillimin HPC Users Meeting January 13, 2017 Guillimin HPC Users Meeting January 13, 2017 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Please be kind to your fellow user meeting attendees Limit

More information

Guillimin HPC Users Meeting April 13, 2017

Guillimin HPC Users Meeting April 13, 2017 Guillimin HPC Users Meeting April 13, 2017 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Please be kind to your fellow user meeting attendees Limit to

More information

Guillimin HPC Users Meeting March 16, 2017

Guillimin HPC Users Meeting March 16, 2017 Guillimin HPC Users Meeting March 16, 2017 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Please be kind to your fellow user meeting attendees Limit to

More information

Outline. March 5, 2012 CIRMMT - McGill University 2

Outline. March 5, 2012 CIRMMT - McGill University 2 Outline CLUMEQ, Calcul Quebec and Compute Canada Research Support Objectives and Focal Points CLUMEQ Site at McGill ETS Key Specifications and Status CLUMEQ HPC Support Staff at McGill Getting Started

More information

Guillimin HPC Users Meeting

Guillimin HPC Users Meeting Guillimin HPC Users Meeting July 16, 2015 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Outline Compute Canada News Storage Updates Software Updates Training

More information

Genius Quick Start Guide

Genius Quick Start Guide Genius Quick Start Guide Overview of the system Genius consists of a total of 116 nodes with 2 Skylake Xeon Gold 6140 processors. Each with 18 cores, at least 192GB of memory and 800 GB of local SSD disk.

More information

UAntwerpen, 24 June 2016

UAntwerpen, 24 June 2016 Tier-1b Info Session UAntwerpen, 24 June 2016 VSC HPC environment Tier - 0 47 PF Tier -1 623 TF Tier -2 510 Tf 16,240 CPU cores 128/256 GB memory/node IB EDR interconnect Tier -3 HOPPER/TURING STEVIN THINKING/CEREBRO

More information

Parallel File Systems. John White Lawrence Berkeley National Lab

Parallel File Systems. John White Lawrence Berkeley National Lab Parallel File Systems John White Lawrence Berkeley National Lab Topics Defining a File System Our Specific Case for File Systems Parallel File Systems A Survey of Current Parallel File Systems Implementation

More information

Introduction to Advanced Research Computing (ARC)

Introduction to Advanced Research Computing (ARC) Introduction to Advanced Research Computing (ARC) September 29, 2016 By: Pier-Luc St-Onge 1 Financial Partners 2 Setup for the workshop 1. Get a user ID and password paper (provided in class): ##: **********

More information

Introduction to PICO Parallel & Production Enviroment

Introduction to PICO Parallel & Production Enviroment Introduction to PICO Parallel & Production Enviroment Mirko Cestari m.cestari@cineca.it Alessandro Marani a.marani@cineca.it Domenico Guida d.guida@cineca.it Nicola Spallanzani n.spallanzani@cineca.it

More information

Computer Science Section. Computational and Information Systems Laboratory National Center for Atmospheric Research

Computer Science Section. Computational and Information Systems Laboratory National Center for Atmospheric Research Computer Science Section Computational and Information Systems Laboratory National Center for Atmospheric Research My work in the context of TDD/CSS/ReSET Polynya new research computing environment Polynya

More information

Before We Start. Sign in hpcxx account slips Windows Users: Download PuTTY. Google PuTTY First result Save putty.exe to Desktop

Before We Start. Sign in hpcxx account slips Windows Users: Download PuTTY. Google PuTTY First result Save putty.exe to Desktop Before We Start Sign in hpcxx account slips Windows Users: Download PuTTY Google PuTTY First result Save putty.exe to Desktop Research Computing at Virginia Tech Advanced Research Computing Compute Resources

More information

OBTAINING AN ACCOUNT:

OBTAINING AN ACCOUNT: HPC Usage Policies The IIA High Performance Computing (HPC) System is managed by the Computer Management Committee. The User Policies here were developed by the Committee. The user policies below aim to

More information

HPC at UZH: status and plans

HPC at UZH: status and plans HPC at UZH: status and plans Dec. 4, 2013 This presentation s purpose Meet the sysadmin team. Update on what s coming soon in Schroedinger s HW. Review old and new usage policies. Discussion (later on).

More information

Graham vs legacy systems

Graham vs legacy systems New User Seminar Graham vs legacy systems This webinar only covers topics pertaining to graham. For the introduction to our legacy systems (Orca etc.), please check the following recorded webinar: SHARCNet

More information

Duke Compute Cluster Workshop. 3/28/2018 Tom Milledge rc.duke.edu

Duke Compute Cluster Workshop. 3/28/2018 Tom Milledge rc.duke.edu Duke Compute Cluster Workshop 3/28/2018 Tom Milledge rc.duke.edu rescomputing@duke.edu Outline of talk Overview of Research Computing resources Duke Compute Cluster overview Running interactive and batch

More information

Habanero Operating Committee. January

Habanero Operating Committee. January Habanero Operating Committee January 25 2017 Habanero Overview 1. Execute Nodes 2. Head Nodes 3. Storage 4. Network Execute Nodes Type Quantity Standard 176 High Memory 32 GPU* 14 Total 222 Execute Nodes

More information

Illinois Proposal Considerations Greg Bauer

Illinois Proposal Considerations Greg Bauer - 2016 Greg Bauer Support model Blue Waters provides traditional Partner Consulting as part of its User Services. Standard service requests for assistance with porting, debugging, allocation issues, and

More information

TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 13 th CALL (T ier-0)

TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 13 th CALL (T ier-0) TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 13 th CALL (T ier-0) Contributing sites and the corresponding computer systems for this call are: BSC, Spain IBM System x idataplex CINECA, Italy Lenovo System

More information

New User Seminar: Part 2 (best practices)

New User Seminar: Part 2 (best practices) New User Seminar: Part 2 (best practices) General Interest Seminar January 2015 Hugh Merz merz@sharcnet.ca Session Outline Submitting Jobs Minimizing queue waits Investigating jobs Checkpointing Efficiency

More information

The Why and How of HPC-Cloud Hybrids with OpenStack

The Why and How of HPC-Cloud Hybrids with OpenStack The Why and How of HPC-Cloud Hybrids with OpenStack OpenStack Australia Day Melbourne June, 2017 Lev Lafayette, HPC Support and Training Officer, University of Melbourne lev.lafayette@unimelb.edu.au 1.0

More information

Chapter 6: File Systems

Chapter 6: File Systems Chapter 6: File Systems File systems Files Directories & naming File system implementation Example file systems Chapter 6 2 Long-term information storage Must store large amounts of data Gigabytes -> terabytes

More information

Introduction to GALILEO

Introduction to GALILEO Introduction to GALILEO Parallel & production environment Mirko Cestari m.cestari@cineca.it Alessandro Marani a.marani@cineca.it Domenico Guida d.guida@cineca.it Maurizio Cremonesi m.cremonesi@cineca.it

More information

Cerebro Quick Start Guide

Cerebro Quick Start Guide Cerebro Quick Start Guide Overview of the system Cerebro consists of a total of 64 Ivy Bridge processors E5-4650 v2 with 10 cores each, 14 TB of memory and 24 TB of local disk. Table 1 shows the hardware

More information

The cluster system. Introduction 22th February Jan Saalbach Scientific Computing Group

The cluster system. Introduction 22th February Jan Saalbach Scientific Computing Group The cluster system Introduction 22th February 2018 Jan Saalbach Scientific Computing Group cluster-help@luis.uni-hannover.de Contents 1 General information about the compute cluster 2 Available computing

More information

HPC Middle East. KFUPM HPC Workshop April Mohamed Mekias HPC Solutions Consultant. Agenda

HPC Middle East. KFUPM HPC Workshop April Mohamed Mekias HPC Solutions Consultant. Agenda KFUPM HPC Workshop April 29-30 2015 Mohamed Mekias HPC Solutions Consultant Agenda 1 Agenda-Day 1 HPC Overview What is a cluster? Shared v.s. Distributed Parallel v.s. Massively Parallel Interconnects

More information

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved.

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved. Minnesota Supercomputing Institute Introduction to MSI Systems Andrew Gustafson The Machines at MSI Machine Type: Cluster Source: http://en.wikipedia.org/wiki/cluster_%28computing%29 Machine Type: Cluster

More information

Introduction to High-Performance Computing (HPC)

Introduction to High-Performance Computing (HPC) Introduction to High-Performance Computing (HPC) Computer components CPU : Central Processing Unit cores : individual processing units within a CPU Storage : Disk drives HDD : Hard Disk Drive SSD : Solid

More information

Introduction to High Performance Computing

Introduction to High Performance Computing Introduction to High Performance Computing By Pier-Luc St-Onge pier-luc.st-onge@mcgill.ca September 11, 2014 Objectives Familiarize new users with the concepts of High Performance Computing (HPC). Outline

More information

Running Jobs, Submission Scripts, Modules

Running Jobs, Submission Scripts, Modules 9/17/15 Running Jobs, Submission Scripts, Modules 16,384 cores total of about 21,000 cores today Infiniband interconnect >3PB fast, high-availability, storage GPGPUs Large memory nodes (512GB to 1TB of

More information

Data storage services at KEK/CRC -- status and plan

Data storage services at KEK/CRC -- status and plan Data storage services at KEK/CRC -- status and plan KEK/CRC Hiroyuki Matsunaga Most of the slides are prepared by Koichi Murakami and Go Iwai KEKCC System Overview KEKCC (Central Computing System) The

More information

DATARMOR: Comment s'y préparer? Tina Odaka

DATARMOR: Comment s'y préparer? Tina Odaka DATARMOR: Comment s'y préparer? Tina Odaka 30.09.2016 PLAN DATARMOR: Detailed explanation on hard ware What can you do today to be ready for DATARMOR DATARMOR : convention de nommage ClusterHPC REF SCRATCH

More information

How to run applications on Aziz supercomputer. Mohammad Rafi System Administrator Fujitsu Technology Solutions

How to run applications on Aziz supercomputer. Mohammad Rafi System Administrator Fujitsu Technology Solutions How to run applications on Aziz supercomputer Mohammad Rafi System Administrator Fujitsu Technology Solutions Agenda Overview Compute Nodes Storage Infrastructure Servers Cluster Stack Environment Modules

More information

Storage Optimization with Oracle Database 11g

Storage Optimization with Oracle Database 11g Storage Optimization with Oracle Database 11g Terabytes of Data Reduce Storage Costs by Factor of 10x Data Growth Continues to Outpace Budget Growth Rate of Database Growth 1000 800 600 400 200 1998 2000

More information

CS 5523 Operating Systems: Memory Management (SGG-8)

CS 5523 Operating Systems: Memory Management (SGG-8) CS 5523 Operating Systems: Memory Management (SGG-8) Instructor: Dr Tongping Liu Thank Dr Dakai Zhu, Dr Palden Lama, and Dr Tim Richards (UMASS) for providing their slides Outline Simple memory management:

More information

Supercomputing environment TMA4280 Introduction to Supercomputing

Supercomputing environment TMA4280 Introduction to Supercomputing Supercomputing environment TMA4280 Introduction to Supercomputing NTNU, IMF February 21. 2018 1 Supercomputing environment Supercomputers use UNIX-type operating systems. Predominantly Linux. Using a shell

More information

XSEDE New User Tutorial

XSEDE New User Tutorial April 2, 2014 XSEDE New User Tutorial Jay Alameda National Center for Supercomputing Applications XSEDE Training Survey Make sure you sign the sign in sheet! At the end of the module, I will ask you to

More information

Introduction to File Structures

Introduction to File Structures 1 Introduction to File Structures Introduction to File Organization Data processing from a computer science perspective: Storage of data Organization of data Access to data This will be built on your knowledge

More information

An ESS implementation in a Tier 1 HPC Centre

An ESS implementation in a Tier 1 HPC Centre An ESS implementation in a Tier 1 HPC Centre Maximising Performance - the NeSI Experience José Higino (NeSI Platforms and NIWA, HPC Systems Engineer) Outline What is NeSI? The National Platforms Framework

More information

Batch Systems. Running calculations on HPC resources

Batch Systems. Running calculations on HPC resources Batch Systems Running calculations on HPC resources Outline What is a batch system? How do I interact with the batch system Job submission scripts Interactive jobs Common batch systems Converting between

More information

Introduction to CINECA HPC Environment

Introduction to CINECA HPC Environment Introduction to CINECA HPC Environment 23nd Summer School on Parallel Computing 19-30 May 2014 m.cestari@cineca.it, i.baccarelli@cineca.it Goals You will learn: The basic overview of CINECA HPC systems

More information

Our new HPC-Cluster An overview

Our new HPC-Cluster An overview Our new HPC-Cluster An overview Christian Hagen Universität Regensburg Regensburg, 15.05.2009 Outline 1 Layout 2 Hardware 3 Software 4 Getting an account 5 Compiling 6 Queueing system 7 Parallelization

More information

Tech Computer Center Documentation

Tech Computer Center Documentation Tech Computer Center Documentation Release 0 TCC Doc February 17, 2014 Contents 1 TCC s User Documentation 1 1.1 TCC SGI Altix ICE Cluster User s Guide................................ 1 i ii CHAPTER 1

More information

Duke Compute Cluster Workshop. 11/10/2016 Tom Milledge h:ps://rc.duke.edu/

Duke Compute Cluster Workshop. 11/10/2016 Tom Milledge h:ps://rc.duke.edu/ Duke Compute Cluster Workshop 11/10/2016 Tom Milledge h:ps://rc.duke.edu/ rescompu>ng@duke.edu Outline of talk Overview of Research Compu>ng resources Duke Compute Cluster overview Running interac>ve and

More information

MIGRATING TO THE SHARED COMPUTING CLUSTER (SCC) SCV Staff Boston University Scientific Computing and Visualization

MIGRATING TO THE SHARED COMPUTING CLUSTER (SCC) SCV Staff Boston University Scientific Computing and Visualization MIGRATING TO THE SHARED COMPUTING CLUSTER (SCC) SCV Staff Boston University Scientific Computing and Visualization 2 Glenn Bresnahan Director, SCV MGHPCC Buy-in Program Kadin Tseng HPC Programmer/Consultant

More information

Experiences using a multi-tiered GPFS file system at Mount Sinai. Bhupender Thakur Patricia Kovatch Francesca Tartagliogne Dansha Jiang

Experiences using a multi-tiered GPFS file system at Mount Sinai. Bhupender Thakur Patricia Kovatch Francesca Tartagliogne Dansha Jiang Experiences using a multi-tiered GPFS file system at Mount Sinai Bhupender Thakur Patricia Kovatch Francesca Tartagliogne Dansha Jiang Outline 1. Storage summary 2. Planning and Migration 3. Challenges

More information

Enhancing Checkpoint Performance with Staging IO & SSD

Enhancing Checkpoint Performance with Staging IO & SSD Enhancing Checkpoint Performance with Staging IO & SSD Xiangyong Ouyang Sonya Marcarelli Dhabaleswar K. Panda Department of Computer Science & Engineering The Ohio State University Outline Motivation and

More information

HPC DOCUMENTATION. 3. Node Names and IP addresses:- Node details with respect to their individual IP addresses are given below:-

HPC DOCUMENTATION. 3. Node Names and IP addresses:- Node details with respect to their individual IP addresses are given below:- HPC DOCUMENTATION 1. Hardware Resource :- Our HPC consists of Blade chassis with 5 blade servers and one GPU rack server. a.total available cores for computing: - 96 cores. b.cores reserved and dedicated

More information

CS/Math 471: Intro. to Scientific Computing

CS/Math 471: Intro. to Scientific Computing CS/Math 471: Intro. to Scientific Computing Getting Started with High Performance Computing Matthew Fricke, PhD Center for Advanced Research Computing Table of contents 1. The Center for Advanced Research

More information

IBM Spectrum Protect HSM for Windows Version Administration Guide IBM

IBM Spectrum Protect HSM for Windows Version Administration Guide IBM IBM Spectrum Protect HSM for Windows Version 8.1.0 Administration Guide IBM IBM Spectrum Protect HSM for Windows Version 8.1.0 Administration Guide IBM Note: Before you use this information and the product

More information

Scalability Guidelines

Scalability Guidelines Version 2.0, Service Pack 5.2, March 29, 2005 Overview Introduction This document provides hardware and software recommendations for deploying SiteProtector 2.0, Service Pack 5.2, as follows: small deployment

More information

Introduction to High-Performance Computing (HPC)

Introduction to High-Performance Computing (HPC) Introduction to High-Performance Computing (HPC) Computer components CPU : Central Processing Unit cores : individual processing units within a CPU Storage : Disk drives HDD : Hard Disk Drive SSD : Solid

More information

Active File Archiving on Windows 2012 R2

Active File Archiving on Windows 2012 R2 Active File Archiving on Windows 2012 R2 Isaac Barton - Assistant Director of IT for Enterprise Infrastructure Services & Cliff Giddens - Systems Support Associate - Enterprise Infrastructure Services

More information

Leveraging Burst Buffer Coordination to Prevent I/O Interference

Leveraging Burst Buffer Coordination to Prevent I/O Interference Leveraging Burst Buffer Coordination to Prevent I/O Interference Anthony Kougkas akougkas@hawk.iit.edu Matthieu Dorier, Rob Latham, Rob Ross, Xian-He Sun Wednesday, October 26th Baltimore, USA Outline

More information

Using Cartesius and Lisa. Zheng Meyer-Zhao - Consultant Clustercomputing

Using Cartesius and Lisa. Zheng Meyer-Zhao - Consultant Clustercomputing Zheng Meyer-Zhao - zheng.meyer-zhao@surfsara.nl Consultant Clustercomputing Outline SURFsara About us What we do Cartesius and Lisa Architectures and Specifications File systems Funding Hands-on Logging

More information

Introduction to GALILEO

Introduction to GALILEO November 27, 2016 Introduction to GALILEO Parallel & production environment Mirko Cestari m.cestari@cineca.it Alessandro Marani a.marani@cineca.it SuperComputing Applications and Innovation Department

More information

INTRODUCTION TO THE CLUSTER

INTRODUCTION TO THE CLUSTER INTRODUCTION TO THE CLUSTER WHAT IS A CLUSTER? A computer cluster consists of a group of interconnected servers (nodes) that work together to form a single logical system. COMPUTE NODES GATEWAYS SCHEDULER

More information

The Oracle Database Appliance I/O and Performance Architecture

The Oracle Database Appliance I/O and Performance Architecture Simple Reliable Affordable The Oracle Database Appliance I/O and Performance Architecture Tammy Bednar, Sr. Principal Product Manager, ODA 1 Copyright 2012, Oracle and/or its affiliates. All rights reserved.

More information

Using the IAC Chimera Cluster

Using the IAC Chimera Cluster Using the IAC Chimera Cluster Ángel de Vicente (Tel.: x5387) SIE de Investigación y Enseñanza Chimera overview Beowulf type cluster Chimera: a monstrous creature made of the parts of multiple animals.

More information

Introduction to GALILEO

Introduction to GALILEO Introduction to GALILEO Parallel & production environment Mirko Cestari m.cestari@cineca.it Alessandro Marani a.marani@cineca.it Alessandro Grottesi a.grottesi@cineca.it SuperComputing Applications and

More information

Introduction to the NCAR HPC Systems. 25 May 2018 Consulting Services Group Brian Vanderwende

Introduction to the NCAR HPC Systems. 25 May 2018 Consulting Services Group Brian Vanderwende Introduction to the NCAR HPC Systems 25 May 2018 Consulting Services Group Brian Vanderwende Topics to cover Overview of the NCAR cluster resources Basic tasks in the HPC environment Accessing pre-built

More information

Introduction to NCAR HPC. 25 May 2017 Consulting Services Group Brian Vanderwende

Introduction to NCAR HPC. 25 May 2017 Consulting Services Group Brian Vanderwende Introduction to NCAR HPC 25 May 2017 Consulting Services Group Brian Vanderwende Topics we will cover Technical overview of our HPC systems The NCAR computing environment Accessing software on Cheyenne

More information

The RAMDISK Storage Accelerator

The RAMDISK Storage Accelerator The RAMDISK Storage Accelerator A Method of Accelerating I/O Performance on HPC Systems Using RAMDISKs Tim Wickberg, Christopher D. Carothers wickbt@rpi.edu, chrisc@cs.rpi.edu Rensselaer Polytechnic Institute

More information

Introduction to High-Performance Computing (HPC)

Introduction to High-Performance Computing (HPC) Introduction to High-Performance Computing (HPC) Computer components CPU : Central Processing Unit CPU cores : individual processing units within a Storage : Disk drives HDD : Hard Disk Drive SSD : Solid

More information

Knights Landing production environment on MARCONI

Knights Landing production environment on MARCONI Knights Landing production environment on MARCONI Alessandro Marani - a.marani@cineca.it March 20th, 2017 Agenda In this presentation, we will discuss - How we interact with KNL environment on MARCONI

More information

MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced

MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced Sarvani Chadalapaka HPC Administrator University of California

More information

Introduction to Python for Scientific Computing

Introduction to Python for Scientific Computing 1 Introduction to Python for Scientific Computing http://tinyurl.com/cq-intro-python-20151022 By: Bart Oldeman, Calcul Québec McGill HPC Bart.Oldeman@calculquebec.ca, Bart.Oldeman@mcgill.ca Partners and

More information

CouchDB-based system for data management in a Grid environment Implementation and Experience

CouchDB-based system for data management in a Grid environment Implementation and Experience CouchDB-based system for data management in a Grid environment Implementation and Experience Hassen Riahi IT/SDC, CERN Outline Context Problematic and strategy System architecture Integration and deployment

More information

Introduction to Cheyenne. 12 January, 2017 Consulting Services Group Brian Vanderwende

Introduction to Cheyenne. 12 January, 2017 Consulting Services Group Brian Vanderwende Introduction to Cheyenne 12 January, 2017 Consulting Services Group Brian Vanderwende Topics we will cover Technical specs of the Cheyenne supercomputer and expanded GLADE file systems The Cheyenne computing

More information

Coordinating Parallel HSM in Object-based Cluster Filesystems

Coordinating Parallel HSM in Object-based Cluster Filesystems Coordinating Parallel HSM in Object-based Cluster Filesystems Dingshan He, Xianbo Zhang, David Du University of Minnesota Gary Grider Los Alamos National Lab Agenda Motivations Parallel archiving/retrieving

More information

Hitachi Data Instance Manager Software Version Release Notes

Hitachi Data Instance Manager Software Version Release Notes Hitachi Data Instance Manager Software Version 4.2.3 Release Notes Contents Contents... 1 About this document... 2 Intended audience... 2 Getting help... 2 About this release... 2 Product package contents...

More information

Analyzing the Performance of IWAVE on a Cluster using HPCToolkit

Analyzing the Performance of IWAVE on a Cluster using HPCToolkit Analyzing the Performance of IWAVE on a Cluster using HPCToolkit John Mellor-Crummey and Laksono Adhianto Department of Computer Science Rice University {johnmc,laksono}@rice.edu TRIP Meeting March 30,

More information

XSEDE New User Tutorial

XSEDE New User Tutorial June 12, 2015 XSEDE New User Tutorial Jay Alameda National Center for Supercomputing Applications XSEDE Training Survey Please remember to sign in for today s event: http://bit.ly/1fashvo Also, please

More information

Data Management. Parallel Filesystems. Dr David Henty HPC Training and Support

Data Management. Parallel Filesystems. Dr David Henty HPC Training and Support Data Management Dr David Henty HPC Training and Support d.henty@epcc.ed.ac.uk +44 131 650 5960 Overview Lecture will cover Why is IO difficult Why is parallel IO even worse Lustre GPFS Performance on ARCHER

More information

Extraordinary HPC file system solutions at KIT

Extraordinary HPC file system solutions at KIT Extraordinary HPC file system solutions at KIT Roland Laifer STEINBUCH CENTRE FOR COMPUTING - SCC KIT University of the State Roland of Baden-Württemberg Laifer Lustre and tools for ldiskfs investigation

More information

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Kefei Wang and Feng Chen Louisiana State University SoCC '18 Carlsbad, CA Key-value Systems in Internet Services Key-value

More information

Duke Compute Cluster Workshop. 10/04/2018 Tom Milledge rc.duke.edu

Duke Compute Cluster Workshop. 10/04/2018 Tom Milledge rc.duke.edu Duke Compute Cluster Workshop 10/04/2018 Tom Milledge rc.duke.edu rescomputing@duke.edu Outline of talk Overview of Research Computing resources Duke Compute Cluster overview Running interactive and batch

More information

Insights into TSM/HSM for UNIX and Windows

Insights into TSM/HSM for UNIX and Windows IBM Software Group Insights into TSM/HSM for UNIX and Windows Oxford University TSM Symposium 2005 Jens-Peter Akelbein (akelbein@de.ibm.com) IBM Tivoli Storage SW Development 1 IBM Software Group Tivoli

More information

File Systems. What do we need to know?

File Systems. What do we need to know? File Systems Chapter 4 1 What do we need to know? How are files viewed on different OS s? What is a file system from the programmer s viewpoint? You mostly know this, but we ll review the main points.

More information

Lustre HSM at Cambridge. Early user experience using Intel Lemur HSM agent

Lustre HSM at Cambridge. Early user experience using Intel Lemur HSM agent Lustre HSM at Cambridge Early user experience using Intel Lemur HSM agent Matt Rásó-Barnett Wojciech Turek Research Computing Services @ Cambridge University-wide service with broad remit to provide research

More information

Getting started with the CEES Grid

Getting started with the CEES Grid Getting started with the CEES Grid October, 2013 CEES HPC Manager: Dennis Michael, dennis@stanford.edu, 723-2014, Mitchell Building room 415. Please see our web site at http://cees.stanford.edu. Account

More information

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved.

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved. Minnesota Supercomputing Institute Introduction to Job Submission and Scheduling Andrew Gustafson Interacting with MSI Systems Connecting to MSI SSH is the most reliable connection method Linux and Mac

More information

X Grid Engine. Where X stands for Oracle Univa Open Son of more to come...?!?

X Grid Engine. Where X stands for Oracle Univa Open Son of more to come...?!? X Grid Engine Where X stands for Oracle Univa Open Son of more to come...?!? Carsten Preuss on behalf of Scientific Computing High Performance Computing Scheduler candidates LSF too expensive PBS / Torque

More information

J O U R N É E D E R E N C O N T R E D E S U T I L I S A T E U R S D U P Ô L E D E C A L C U L I N T E N S I F P O U R L A M E R P I E R R E C O T T Y

J O U R N É E D E R E N C O N T R E D E S U T I L I S A T E U R S D U P Ô L E D E C A L C U L I N T E N S I F P O U R L A M E R P I E R R E C O T T Y J O U R N É E D E R E N C O N T R E D E S U T I L I S A T E U R S D U P Ô L E D E C A L C U L I N T E N S I F P O U R L A M E R P I E R R E C O T T Y Le programme de la journée 9 h 40-12 h 10 DATARMOR:

More information

libhio: Optimizing IO on Cray XC Systems With DataWarp

libhio: Optimizing IO on Cray XC Systems With DataWarp libhio: Optimizing IO on Cray XC Systems With DataWarp May 9, 2017 Nathan Hjelm Cray Users Group May 9, 2017 Los Alamos National Laboratory LA-UR-17-23841 5/8/2017 1 Outline Background HIO Design Functionality

More information

Allinea DDT Debugger. Dan Mazur, McGill HPC March 5,

Allinea DDT Debugger. Dan Mazur, McGill HPC  March 5, Allinea DDT Debugger Dan Mazur, McGill HPC daniel.mazur@mcgill.ca guillimin@calculquebec.ca March 5, 2015 1 Outline Introduction and motivation Guillimin login and DDT configuration Compiling for a debugger

More information

Minerva User Group 2018

Minerva User Group 2018 Minerva User Group 2018 Patricia Kovatch Bhupender Thakur, PhD Francesca Tartaglione, MS Dansha Jiang, PhD Eugene Fluder, PhD Hyung Min Cho, PhD Lili Gai, PhD Jan 18, 2018 Outline Welcome and general comments

More information

XSEDE New User Tutorial

XSEDE New User Tutorial May 13, 2016 XSEDE New User Tutorial Jay Alameda National Center for Supercomputing Applications XSEDE Training Survey Please complete a short on-line survey about this module at http://bit.ly/hamptonxsede.

More information

Slurm and Abel job scripts. Katerina Michalickova The Research Computing Services Group SUF/USIT October 23, 2012

Slurm and Abel job scripts. Katerina Michalickova The Research Computing Services Group SUF/USIT October 23, 2012 Slurm and Abel job scripts Katerina Michalickova The Research Computing Services Group SUF/USIT October 23, 2012 Abel in numbers Nodes - 600+ Cores - 10000+ (1 node->2 processors->16 cores) Total memory

More information

Oracle Database 12c: JMS Sharded Queues

Oracle Database 12c: JMS Sharded Queues Oracle Database 12c: JMS Sharded Queues For high performance, scalable Advanced Queuing ORACLE WHITE PAPER MARCH 2015 Table of Contents Introduction 2 Architecture 3 PERFORMANCE OF AQ-JMS QUEUES 4 PERFORMANCE

More information

Backups and archives: What s the scoop?

Backups and archives: What s the scoop? E-Guide Backups and archives: What s the scoop? What s a backup and what s an archive? For starters, one of the differences worth noting is that a backup is always a copy while an archive should be original

More information

Brutus. Above and beyond Hreidar and Gonzales

Brutus. Above and beyond Hreidar and Gonzales Brutus Above and beyond Hreidar and Gonzales Dr. Olivier Byrde Head of HPC Group, IT Services, ETH Zurich Teodoro Brasacchio HPC Group, IT Services, ETH Zurich 1 Outline High-performance computing at ETH

More information