HPCC User Group Meeting


1 High Performance Computing Center
HPCC User Group Meeting
Planned Update of Quanah and Hrothgar
Eric Rees, Research Associate - HPCC
June 6, 2018

2 HPCC User Group Meeting Agenda (6/6/2018)
- State of the Clusters
- Updating the Clusters
- Shutdown Schedule
- Getting Help
- Q&A

3 HPCC User Group Meeting State of the Clusters

4 Updating the Clusters

Quanah Cluster
- Commissioned in 2017, updated in Q
- Consists of 467 nodes
- 16,812 cores
- 87.56 TB total RAM
- Xeon E5-2695v4 Broadwell processors
- Omni-Path (100 Gbps) fabric

Hrothgar Cluster
- Commissioned in 2011, updated in 2014, downgraded in Q
- Consists of 455 nodes (33 nodes in CC)
- 6,528 cores (660 in CC)
- A mixture of Xeon X5660 Westmere processors and Xeon E Ivy Bridge processors
- DDR & QDR (20 Gbps) InfiniBand fabric

5 Cluster Utilization Primary Queues Only

6 Cluster Updates

Previous Cluster Updates
- Q  Commissioned Quanah for general use
- Q  Upgraded Quanah / downgraded Hrothgar
  - Switched to OpenHPC/Warewulf for node provisioning on Hrothgar
  - Isolated the storage network
  - Invested significant time into stabilizing the Omni-Path network (Quanah)
- Q  Invested significant time into stabilizing the InfiniBand network (Hrothgar)
  - Invested in a high-speed (10 Gbps) line to the Chemistry server room

Future Cluster Updates
- Early Q (July)
  - Updating the scheduler to UGE 8.5.5
  - Updating the scheduler's policies
  - Unifying the Quanah, Hrothgar, and CC environments (OS, installed software, containers)
  - Stabilizing Hrothgar nodes by updating to CentOS 7
  - Increasing the size of Hrothgar's serial queue by adding 239 nodes (2,868 cores) to it
  - Updating the Software Installation Policy
- Late Q  Replace the Hrothgar UPS (tentative)
- Early-Mid Q  Commission the new HPCC generator

7 HPCC User Group Meeting Updating the Clusters

8 Updating the Clusters
Three major changes will take place during the July shutdown:
1. Update the scheduler
   - Update the version to 8.5.5
   - Switch to a share-tree based policy
2. Update all nodes to CentOS 7.4
   - Bring all clusters into the same environment
3. Update the Software Installation Policy

9 HPCC User Group Meeting Updating the Scheduler

10 Updating the Scheduler
The following changes will be made to the omni queue:
- Updating to a new version of the Univa scheduler (8.5.5)
- The fill parallel environment (-pe fill) will be removed
- Switching to a share-tree based policy
- Adding two new projects (xlquanah and hep)
- Implementing new features (JSV & RQS)
- Implementing memory constraints

11 Updating the Scheduler
We were originally on a share-based policy.
- HPCC encountered a new bug in the UGE scheduler.
- Our systems couldn't share resources fairly due to the coding error.
- Univa was unable to determine the source of the problem.
- Univa's resolution: perform a fresh re-install of the UGE scheduler.

How does UGE assign priority?

prio = weight_priority * pprio
     + weight_urgency * normalized(hrr + weight_waitingtime * waiting_time_in_seconds + weight_deadline / time_remaining_in_seconds)
     + weight_ticket * normalized(ftckt + otckt + stckt)

(ftckt, otckt, and stckt are the functional, override, and share-tree ticket counts; the share-tree tickets are where the new policy enters the calculation.)

12 Switching to a share-tree based policy
The updated share-tree policy will cause the following:
- Max priority value:
- Cluster usage will be the primary contributor to the number of share tickets you receive.
- Your share is based on the number of core hours you have used, are currently using, and are projected to use based on currently waiting jobs.
- Your share (as a value) is halved every 7 days. For example, a 10 core-hour job run today counts as 5 core-hours after 7 days and 2.5 core-hours after 14 days (see the sketch below).
- Waiting time, deadline time, and slot urgency will weigh in very little.
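As a rough illustration of the 7-day half-life, the decayed usage is usage * 0.5^(days/7). The scheduler computes this internally; the snippet below only shows the arithmetic behind the example above.

```
# Illustration of the 7-day usage half-life: 10 core-hours decaying over time.
# This is not scheduler code; it only demonstrates the arithmetic.
for days in 0 7 14 21; do
    awk -v d="$days" 'BEGIN { printf "after %2d days: %.2f core-hours\n", d, 10 * 0.5 ^ (d / 7) }'
done
```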

13 Switching to a share-tree based policy
Expected outcomes: usage will now play heavily into a user's share of the cluster.
- Heavy users will take a hit to their job priority and run less often.
- Moderate users will likely get a boost to their job priorities and run more often.
- Minor/rare users will likely not notice any differences.
Example: User A submits two 3600-core jobs and User B submits two hundred 36-core jobs. A will get to run 1 job, then B will get to run 100 jobs before A is granted their second job.

14 Additional Projects
The omni queue will now contain 3 projects (example submissions below):
- quanah (default): able to use all 16,812 cores, a maximum 48-hour run time, and the sm and mpi parallel environments.
- xlquanah: able to use 144 cores, a maximum 120-hour run time, and only the sm parallel environment.
- hep: able to use 720 cores with no other restrictions. Only available to the High Energy Physics user group.
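For illustration, project selection under UGE is normally done with the -P flag at submission time. The lines below are hypothetical submissions, not HPCC-published syntax; check the updated user guides for the recommended form.

```
# Hypothetical submissions against the three omni-queue projects (flag usage assumed,
# not taken from the slides; confirm against the HPCC user guides after the update).
qsub -q omni -P quanah   -pe mpi 72 mpi_job.sh    # default project: up to 48 h, sm or mpi
qsub -q omni -P xlquanah -pe sm 36  long_job.sh   # long-running jobs: up to 120 h, sm only
qsub -q omni -P hep      hep_job.sh               # High Energy Physics group only
```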

15 Implementing New Features
The omni queue will now make use of the following:
- Resource Quota Sets (RQS): the RQS will act as an enforcer of some job specifications, primarily helping enforce run-time limits on jobs.
- Job Submission Verifier (JSV): this script will run after you submit a qsub request but before your job is officially submitted. The JSV will either accept your job, make any necessary changes to your job and then accept it, or reject your job.

16 Job Submission Verifier
JSV for the omni queue. The JSV will do the following (illustrated below):
- Ensure only jobs that could possibly run actually get accepted (accept, correct, or reject).
- Ensure time limits, memory, and other constraints are enforced.
- Try to prevent the most common reasons for jobs to get stuck in the qw state, such as requesting more than 36 cores for -pe sm or a non-multiple of 36 for -pe mpi.
- Enforce resource time requests: 48 hours on quanah, 120 hours on xlquanah.
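As a concrete illustration of the rules above, the submissions below would be handled differently by the JSV. The run-time resource name (h_rt) is the standard UGE one and is an assumption here; the slides do not name the flag.

```
# Illustrative submissions against the JSV rules described above (not official examples).
qsub -q omni -pe mpi 72 -l h_rt=48:00:00 run.sh   # accepted: multiple of 36 cores, within 48 h
qsub -q omni -pe sm 40 run.sh                     # rejected: sm jobs cannot request more than 36 cores
qsub -q omni -pe mpi 50 run.sh                    # rejected or corrected: 50 is not a multiple of 36
```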

17 Job Submission Verifier
JSV for the omni queue (continued). Enforce memory constraints:
- If not defined by the user, calculate (# of requested slots * ~5.3 GB) and set it in the job (see the sketch below).
- If the requested memory size is greater than the total memory of the requested nodes, reject the job.
- If the job is serial (sm), the maximum requested memory should not exceed 192 GB.
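A minimal sketch of the default-memory rule described above, assuming the ~5.3 GB-per-slot figure from the slide; this is not the actual JSV code.

```
# Sketch of the "no memory request given" default from the slide: slots * ~5.3 GB.
slots=36
default_gb=$(awk -v s="$slots" 'BEGIN { printf "%.0f", s * 5.3 }')
echo "A ${slots}-slot job with no h_vmem request would be assigned about ${default_gb} GB"
```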

18 Implementing Memory Constraints
All jobs will now have a set amount of memory they are limited to.
- Users can define their memory resources using -l h_vmem=<int>[M|G]. Example for requesting 6 GB of memory: -l h_vmem=6G
How do memory constraints work?
- The omni queue will make use of soft memory constraints: any job that goes over its requested maximum memory will be killed only if there is memory contention.
- Example 1: Your job requested 10 GB and uses 11 GB. No other user is on the system, so your job continues.
- Example 2: Your job requested 10 GB and uses 11 GB. All remaining memory has been given to other jobs on the same node (memory contention), so your job is killed.
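One practical way to pick a sensible h_vmem value is to look at what a representative job actually used. The command below relies on standard UGE qstat output and is a suggestion rather than anything stated on the slides.

```
# Inspect a running job's resource usage; the "usage" line in qstat -j output reports
# vmem/maxvmem. Replace 123456 with one of your real job IDs.
qstat -j 123456 | grep -i usage
```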

19 HPCC User Group Meeting Updating the Environment

20 Updating the Environment
The following changes will be made to all nodes:
- The operating system will be brought up to CentOS 7.4 (Quanah is currently on CentOS 7.3, Hrothgar on CentOS 6.9).
- All HPCC-installed software will now be converted to RPMs and installed locally on the nodes. Currently all software is installed from the NFS servers.
- Modules will now be the same for all clusters.
- Singularity containers will be available for use on all clusters (example below).
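Since Singularity will be available on every cluster after the update, a container-based workflow will run the same way everywhere. The image and command below are placeholders, not software the HPCC has promised to provide.

```
# Running a tool from a Singularity image; the image name and command are hypothetical.
singularity exec my_tools.sif python3 analysis.py
```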

21 Updating the Environment
Why change the environment?
Stability
- OpenHPC does not support CentOS 6.9; as such, Hrothgar has been less stable than Quanah. This update will improve the stability of Hrothgar.
- Migrating to a common OS will make switching between clusters easier and reduce the number of compilations users may need to perform.
Security
- Some Spectre and Meltdown protections will be implemented by moving to CentOS 7.4.

22 Updating the Environment
Why change the environment? (continued)
Speed
- Migrating to a common OS allows us to install software directly to the nodes in an automated fashion.
- Running applications locally will improve the runtime of many commonly used applications.

23 Updating the Environment
What does this mean for me?
- Software may require re-compilation. This will primarily apply to Hrothgar, though some Quanah applications may need it as well.
- You will be able to run containers on Hrothgar just as you do on Quanah.
- Module paths will be the same on Quanah and Hrothgar. This will make switching between the clusters easier and will resolve the problems with module spider you see on Hrothgar (see the example below).
- You will be able to submit jobs to West or Ivy from any node; you will no longer be restricted to Hrothgar nodes for West and Ivy nodes for Ivy.
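Because the module tree becomes identical on both clusters, the same lookup commands will behave the same way everywhere. The package names below are only examples of the kind of query you might run, not a list of installed software.

```
# Example module queries after the update; package names are illustrative only.
module spider gromacs     # search the full module tree (the command that currently misbehaves on Hrothgar)
module avail              # list modules visible in the current environment
module list               # show what you currently have loaded
```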

24 HPCC User Group Meeting July Shutdown Schedule

25 July Shutdown
Shutdown Timeline
- July 6, 5:00 PM: Disable and begin draining all queues
- July 9: Shut down all clusters; upgrade the OS; install the new scheduler; test the new scheduler and OS updates
- July 10: Continue testing and debugging any issues
- July 11: Complete testing; 5:00 PM - reopen clusters for general use

26 Quanah, Hrothgar and Shutdown Q&A Session
When will the HPCC fully shut down?
- Approximately 8:00 AM on July 9, 2018.
Will the community clusters change?
- Yes: your systems will be upgraded from CentOS 6.9 to CentOS 7.4, and software may require recompilation.
- No: job submissions, controlled access, and current resources will stay the same.
Will my data be affected?
- No, we do not anticipate any unintended alteration or loss of data stored on our storage servers.

27 Quanah, Hrothgar and Shutdown Q&A Session
When will there be another purge of /lustre/scratch?
- The next purge will occur before the shutdown. An email explaining which files will be deleted will go out soon.
- If you have data you can't afford to lose, back it up!
Can I buy in to Quanah like Hrothgar?
- Yes. For efficiency, we are moving to a class-of-service model rather than isolated dedicated resources for the "community cluster" shared service.
- Buying in gives you priority access to your portion of the Quanah cluster; unused compute cycles will be shared among other users.
- You still have ownership of specific machines for capital expenditure and inventory purposes.

28 Quanah, Hrothgar and Shutdown Q&A Session
Can I purchase storage space?
- Yes, we are considering several options for storage based on relative needs for speed, reliability, and potential optional backup.
- We are developing a Research Data Management System (RDMS) service in conjunction with the TTU Library.
- For more information, please contact Dr. Alan Sill (alan.sill@ttu.edu) or Dr. Eric Rees (eric.rees@ttu.edu).

29 Quanah, Hrothgar and Shutdown Q&A Session
Can I access my data during the shutdown?
- Yes, the Globus Connect endpoint (Terra) will remain online during the shutdown.
Can I move my data before the shutdown?
- Yes, please use Globus Connect. See the HPCC User Guides or visit:
Any questions before we continue?

30 HPCC User Group Meeting User Engagement

31 User Engagement
HPCC Seminars
- In the process of planning these out.
- Monthly or bi-monthly during the long semesters.
- Meetings will include presentations from TTU researchers on research performed (in part) using HPCC resources, plus news regarding the HPCC (planned shutdowns, planned changes).
HPCC User Training
- Restarting the HPCC user training courses.
- General / New User training courses will be offered during the 2nd and 4th weeks of each long semester and the 1st and 2nd weeks of each summer semester.
- Training courses on topics of interest or advanced topics will occur as interest or need arises.
- Surveys regarding user needs and current or future HPCC services.

32 HPCC User Training Topics
Next HPCC User Training
- Date: June 12th
- Topic: New User Training
Future Training Topics
- General / New User Training
- Training on topics of interest
- Advanced Job Scheduling
- Singularity Containers (Building and Using)

33 HPCC User Group Meeting Getting Help

34 Getting Help
Best ways to get help:
- Visit our website - hpcc.ttu.edu. Most user guides have been updated and new user guides are being added.
- Submit a support ticket: send an email to hpccsupport@ttu.edu

35 July Shutdown
Shutdown Timeline
- July 6, 5:00 PM: Disable and begin draining all queues
- July 9: Shut down all clusters; upgrade the OS; install the new scheduler; test the new scheduler and OS updates
- July 10: Continue testing and debugging any issues
- July 11: Complete testing; 5:00 PM - reopen clusters for general use
What should you do to prepare?
- Ensure any jobs you wish to run are queued well before the shutdown.
- Test your Hrothgar applications using the serial queue (which has been converted to CentOS 7). Instructions will be available on our website soon - hpcc.ttu.edu
- Qlogin to the Hrothgar serial queue and ensure any required modules are still available (see the sketch below).
Questions? Comments? Concerns?
Email me at eric.rees@ttu.edu or send an email to hpccsupport@ttu.edu
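One possible way to run the pre-shutdown test described above, entered interactively one line at a time. The exact qlogin arguments and module names are assumptions; the official instructions posted on hpcc.ttu.edu take precedence.

```
# Hypothetical pre-shutdown test on the CentOS 7 serial queue (queue name and module
# names are placeholders; follow the instructions posted on hpcc.ttu.edu).
qlogin -q serial              # request an interactive session on the serial queue
module spider gromacs         # confirm the modules you depend on still exist
module load gcc               # load an example toolchain
./my_short_test_job.sh        # run a brief test of your workflow
```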
