Cluster Network Products

1 Cluster Network Products
Cluster interconnects include, among others:
- Gigabit Ethernet
- Myrinet
- Quadrics
- InfiniBand

2 Interconnects in Top500 list 11/2009

3 Interconnects in Top500 list 11/2008

4 Cluster Network Technologies: Gigabit Ethernet
- The technology has matured and now offers very good performance at very low cost.
- Latency is moderate: many Ethernet switches are designed for general-purpose LANs (store-and-forward), where reducing latency is not necessarily the primary goal (latency is on the order of ms).
- Zero-copy, OS-bypass message passing can be supported with a programmable NIC and direct memory access.

5 Cluster Network Technologies: Myrinet
- Uses fibre optic cable.
- Uses a fat-tree structure.
- Low latency (7-10 µs) with a peak bandwidth of 4 Gbps.
- Provides zero-copy message passing and can offload packet processing to the NIC.
- Uses cut-through (wormhole) switching to reduce latency.
- More expensive than Ethernet.
Figure: (a) twisted pair cable in Ethernet; (b) fibre optic cable.
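
The latency difference between store-and-forward and cut-through switching can be sketched with a simple serialisation-delay model. This is only an illustration: it ignores propagation and switch processing time, and the packet size, header size, link speed and path length below are assumed values, not figures from the slides.

```python
def store_and_forward_latency(packet_bits, link_bps, links):
    """Every switch receives the whole packet before forwarding it,
    so the packet is serialised once per link on the path."""
    return links * packet_bits / link_bps


def cut_through_latency(packet_bits, header_bits, link_bps, links):
    """Each switch starts forwarding as soon as the header has arrived,
    so only the header is delayed at each of the (links - 1) switches."""
    return (links - 1) * header_bits / link_bps + packet_bits / link_bps


# Illustrative values only (assumptions, not taken from the slides).
PACKET_BITS = 1500 * 8   # 1500-byte packet
HEADER_BITS = 8 * 8      # 8-byte routing header
LINK_BPS = 2e9           # 2 Gbit/s links
LINKS = 4                # links between source and destination

if __name__ == "__main__":
    sf = store_and_forward_latency(PACKET_BITS, LINK_BPS, LINKS)
    ct = cut_through_latency(PACKET_BITS, HEADER_BITS, LINK_BPS, LINKS)
    print(f"store-and-forward: {sf * 1e6:.3f} us")
    print(f"cut-through:       {ct * 1e6:.3f} us")
```

With these assumed values the store-and-forward path costs about 24 µs of serialisation delay and the cut-through path about 6.1 µs, and the gap grows with the number of switches on the path.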

6 Zero copy protocol

7 Cluster Network Technologies: Quadrics
- Product of a strategic partnership between Quadrics and Compaq (used in ASCI Q).
- Uses a fat quad-tree topology.
- Very low latency of 2-5 µs, due to fast interconnects and a highly tuned software stack (MPI libraries); bandwidth is about 2 Gbps.

8 Cluster Network Technologies: InfiniBand
- By Intel.
- Basic link speed of 2.5 Gb/s.
- Cut-through (wormhole) switches are used.
- Current installations achieve latencies of less than 7 µs, and this is being improved.

9 Example Clusters

10 BlueGene/L
No. 1 in Top500 list from
Source: IBM

11 BlueGene/L networking
- The BlueGene system employs several network types.
- Central is the torus interconnection network: a 3D torus with wrap-around.
- Each node connects to six neighbours over bidirectional links.
- Routing is done in hardware.
- Each link carries 1.4 Gbit/s, so a node has 1.4 x 6 x 2 = 16.8 Gbit/s aggregate bandwidth.
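
A minimal sketch of how wrap-around gives every node exactly six neighbours, and of the aggregate bandwidth arithmetic. The torus dimensions (8, 8, 8) and the function names are hypothetical, chosen only for illustration; only the 1.4 Gbit/s link rate, the six links and the two directions come from the slide.

```python
def torus_neighbours(node, dims):
    """Return the six neighbours of `node` in a 3D torus of size `dims`.

    The modulo implements the wrap-around links at the edges of the mesh.
    """
    x, y, z = node
    nx, ny, nz = dims
    return [
        ((x + 1) % nx, y, z), ((x - 1) % nx, y, z),
        (x, (y + 1) % ny, z), (x, (y - 1) % ny, z),
        (x, y, (z + 1) % nz), (x, y, (z - 1) % nz),
    ]


def aggregate_bandwidth_gbit(link_gbit=1.4, links=6, directions=2):
    """Per-node aggregate bandwidth: link rate x links x directions."""
    return link_gbit * links * directions


if __name__ == "__main__":
    dims = (8, 8, 8)  # hypothetical torus size, for illustration only
    print(torus_neighbours((0, 0, 7), dims))
    print(round(aggregate_bandwidth_gbit(), 1), "Gbit/s")
```

A node on the edge of the mesh, such as (0, 0, 7), still has six distinct neighbours because the -1 and +1 steps wrap to the opposite face.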

12 BlueGene/L: the other three networks
- Binary combining tree
  - Used for collective/global operations: reductions, sums, products, barriers, etc.
  - Low latency (2 µs)
- Gigabit Ethernet I/O network
  - Supports file I/O
  - An I/O node is responsible for performing I/O operations for 128 processors
- Diagnostic and control network
  - Booting nodes, monitoring processors
Each chip has all four network interfaces (torus, tree, I/O, diagnostics).
Note that specialised networks are used for different purposes, which is quite different from many other HPC cluster architectures.
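
The idea behind a binary combining tree can be sketched in software: values are combined pairwise, level by level, so a reduction over n values needs about log2(n) combining steps rather than n - 1 sequential ones. This is a plain Python illustration of the pattern, not a model of the BlueGene/L hardware; `tree_reduce` is a hypothetical helper.

```python
import operator


def tree_reduce(values, op):
    """Combine `values` pairwise, level by level, as a binary tree would.

    Returns (result, levels), where `levels` is the number of combining steps.
    """
    level = list(values)
    if not level:
        raise ValueError("need at least one value")
    levels = 0
    while len(level) > 1:
        # Combine adjacent pairs; an unpaired last value moves up unchanged.
        combined = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            combined.append(level[-1])
        level = combined
        levels += 1
    return level[0], levels


if __name__ == "__main__":
    total, levels = tree_reduce(range(1, 9), operator.add)
    print(total, levels)
```

The same function covers sums, products and maxima by swapping the operator, which mirrors how one tree network can serve several collective operations.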

13 BlueGene/L Message Passing
- The BlueGene team put a good deal of effort into developing an efficient MPI implementation to reduce latency in the software stack.
- Using the MPICH code base as a starting point, the MPI library was enhanced to exploit the machine architecture, for example by using the combining tree for reductions and broadcasts.
Reading paper: Filtering Failure Logs for a BlueGene/L Prototype

14 ASCI Q
- The Q supercomputing system at Los Alamos National Laboratory (LANL)
- Product of the Advanced Simulation and Computing (ASCI) program
- Used for simulation and computational modelling
- No. 2 in the Top500 supercomputer list in 2002

15 ASCI Q
- Classical cluster architecture
- SMPs (AlphaServer ES45s from HP) are put in one segment
  - Each with four EV Ghz CPUs with 16 MB cache
- The whole system has 3 segments, which can operate independently or as a single system
- Aggregate 60 TeraFLOPS capability
- 33 Terabytes of memory
- 664 TB of global storage
- Interconnection using the Quadrics switch interconnect (QSNet): a high-bandwidth (250 MB/s), low-latency (5 µs) network
Top500 list:

16 Earth Simulator
- Built by NEC, located in the Earth Simulator Centre in Japan
- Used for running global climate models to evaluate the effects of global warming
- No.1 from

17 Earth Simulator
- 640 nodes, each with 8 vector processors and 16 GB memory
- Two nodes are installed in one cabinet
- In total:
  - 5120 processors (NEC SX-5)
  - 10 TeraByte memory
  - 700 TeraByte of disk storage and 1.6 PetaByte of tape storage
- Computing capacity: 36 TFlop/s
- Networking: crossbar interconnection (very expensive)
  - Bandwidth: 16 GB/s between any two nodes
  - Latency: 5 µs
- Dual-level parallelism: OpenMP within a node, MPI between nodes
- Physical installation: the machine resides on the 3rd floor; cables on the 2nd; power generation and cooling on the 1st and ground floors

18 UK systems: Cambridge PowerEdge
- 576 Dell PowerEdge 1950 compute servers
- Computing capability: 28 TFlop/s
- Each server has two dual-core Intel Xeon 5160 processors (3 GHz) and 8 GB of memory
- InfiniBand network
  - Bandwidth: 10 Gbit/s
  - Latency: 7 µs
- 60 TeraByte of disk storage

19 Cluster Workload Management
Goal: maximising the delivery of resources to jobs, given job requirements and local policy restrictions.
Three parties:
- Users: supply the job requirements
- Administrators: describe local use policies
- Workload management software: monitors the state of the cluster, schedules the jobs and tracks resource usage
Some or all of the following activities are performed:
- Queuing
- Scheduling
- Monitoring
- Resource management
- Accounting

20 Queuing
Job submission usually consists of two primary parts:
- Resource requirements (e.g. the amount of memory, the number of CPUs needed)
- Job description (e.g. job name, the location of the required input files)
Once submitted, jobs are held in the queue until matching resources are available.
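
The two parts of a submission, and the idea of holding jobs until matching resources exist, can be sketched as follows. The `Job` fields and the `release_runnable` helper are hypothetical names for illustration; which jobs to release is really a scheduling policy decision, and this sketch simply releases any queued job that fits.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    # Job description
    name: str
    input_files: list = field(default_factory=list)
    # Resource requirements
    ncpus: int = 1
    mem_mb: int = 0


def release_runnable(queue, free_cpus, free_mem_mb):
    """Remove and return the queued jobs that fit the free resources,
    scanning in submission order; the rest stay queued in their original order."""
    started = []
    for _ in range(len(queue)):
        job = queue.popleft()
        if job.ncpus <= free_cpus and job.mem_mb <= free_mem_mb:
            free_cpus -= job.ncpus
            free_mem_mb -= job.mem_mb
            started.append(job)
        else:
            queue.append(job)  # still waiting for matching resources
    return started


if __name__ == "__main__":
    queue = deque([
        Job("small", ncpus=4, mem_mb=400),
        Job("wide", ncpus=16, mem_mb=100),
    ])
    print([j.name for j in release_runnable(queue, free_cpus=8, free_mem_mb=1000)])
    print([j.name for j in queue])
```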

21 Scheduling
Determining at what time, and on which resources, a job should be put into execution.
There are a variety of metrics for measuring scheduling performance:
- System-oriented metrics (e.g. throughput, utilisation, average response time of all jobs)
- User-oriented metrics (e.g. response time of a job submitted by a user)
These can contradict each other, so a balance has to be struck.
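
As a rough illustration of these metrics, the sketch below computes them from a list of finished jobs. The definitions are assumptions made for the example: response time is taken as completion time minus submission time, and throughput and utilisation are measured over the span from the first submission to the last completion. The record layout is hypothetical.

```python
def scheduling_metrics(jobs, total_cpus):
    """jobs: list of dicts with keys submit, start, end (same time unit) and cpus."""
    first_submit = min(j["submit"] for j in jobs)
    last_end = max(j["end"] for j in jobs)
    makespan = last_end - first_submit
    # CPU-time actually used by the jobs
    busy = sum(j["cpus"] * (j["end"] - j["start"]) for j in jobs)
    response = [j["end"] - j["submit"] for j in jobs]
    return {
        "throughput": len(jobs) / makespan,             # system-oriented
        "utilisation": busy / (total_cpus * makespan),  # system-oriented
        "avg_response": sum(response) / len(response),  # system-oriented
        "per_job_response": response,                   # user-oriented
    }


if __name__ == "__main__":
    jobs = [
        {"submit": 0, "start": 0, "end": 4, "cpus": 2},
        {"submit": 0, "start": 4, "end": 6, "cpus": 4},
    ]
    print(scheduling_metrics(jobs, total_cpus=4))
```

In this example the second job's own response time (6) is worse than the average (5), which is the kind of tension between system-oriented and user-oriented views that the slide refers to.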

22 Monitoring
- Provides information to administrators, users and the scheduling system on the status of jobs and resources.
- The method of collection may differ between workload management systems, but the general purposes are the same.

23 Resource management
Handling the details of:
- Starting a job under the identity of the user
- Stopping a job
- Cleaning up after a job either completes or is aborted
- Removing or adding resources
In a batch system, jobs are executed in such a way that users need not be present during execution.
In an interactive system, users have to be present to supply arguments or information while their jobs execute.

24 Accounting
- Accounting for which users are using what resources, and for how long
- Collecting resource usage data (e.g. job owner, resources requested by the job, total amount of resources consumed by the job)
Accounting data can be used for:
- Producing system usage and user usage reports
- Tuning the scheduling policy
- Calculating future resource allocations
- Anticipating future resource requirements by users
- Determining areas for improvement within the cluster

25 PBS
- PBS, the Portable Batch System, is a flexible workload management and job scheduling system
- Originally developed at NASA
- Different versions of PBS: OpenPBS, PBS Pro, Torque
- Three key system daemons:
  - pbs_server: runs on the head node; the centre of PBS
  - pbs_mom: runs on the compute nodes; actually places jobs into execution
  - pbs_sched: schedules jobs

26 PBS
PBS job submission script:
    #!/bin/sh
    #PBS -l walltime=1:00:00
    #PBS -l mem=400mb
    #PBS -l ncpus=4
    cd ${HOME}/PBS/test
    mpirun -np 4 myprogram
Submitting a job:
    % qsub myscriptfile
Inquiring about the status of a job:
    % qstat
Deleting a job:
    % qdel

27 Maui
- Developed by the Maui High Performance Computing Center and other partners
- A job scheduler that can interact with a number of different resource managers (e.g. PBS)
- Maui is an external scheduler: it does not include a resource manager, but rather extends the capabilities of existing resource managers
- The underlying resource manager continues to maintain responsibility for managing nodes and tracking jobs
- Maui uses the APIs of other resource managers (e.g. PBS) to obtain system information
- Maui controls the decisions of when, where, and how jobs will run

28 Scheduling Policies
The simplest policy: First-Come First-Served (FCFS)
- Jobs are started in the same order as they are submitted.
- Does not require prior knowledge about tasks (e.g. runtime).
- Problem: a job can block other jobs from starting, even when there is no performance benefit to either user.
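
The blocking problem can be seen in a small simulation. This is a minimal sketch under simplifying assumptions: all jobs are submitted at time 0 in list order, each needs a fixed number of CPUs for a known runtime, and the job names and sizes are made up for illustration.

```python
import heapq


def fcfs_schedule(jobs, total_cpus):
    """jobs: list of (name, cpus, runtime), all submitted at time 0 in list order.

    Jobs start strictly in submission order. Returns {name: start_time}.
    """
    running = []          # min-heap of (end_time, cpus) for jobs in execution
    free = total_cpus
    now = 0
    starts = {}
    for name, cpus, runtime in jobs:
        if cpus > total_cpus:
            raise ValueError(f"{name} requests more CPUs than the cluster has")
        # Wait for running jobs to finish until this job fits.
        while free < cpus:
            end, released = heapq.heappop(running)
            now = max(now, end)
            free += released
        starts[name] = now
        free -= cpus
        heapq.heappush(running, (now + runtime, cpus))
    return starts


if __name__ == "__main__":
    # Hypothetical workload on a 4-CPU cluster.
    jobs = [("A", 2, 4), ("B", 4, 2), ("C", 2, 2)]
    print(fcfs_schedule(jobs, total_cpus=4))
```

Here job C needs only 2 CPUs and 2 CPUs sit idle while A runs, yet C cannot start until B has started, because B is ahead of it in the queue and must wait for the whole machine.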

29 First-Come First-Served

30 Backfilling
- The problem with FCFS is that idle time (the sum of unused processing intervals) can be significant.
- One improvement is to backfill: a job further back in the queue is allowed to start if doing so does not delay the first job in the queue.

31 Backfilling

32 Backfilling
Advantages:
- Utilisation is improved.
Disadvantages:
- Information about job execution time is required, and user estimates are usually inaccurate.
- What to do when a job overruns is a policy decision. Many administrators choose to terminate a job that exceeds its allocated execution time; otherwise some users may deliberately underestimate the job length to get an earlier start time.
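
A minimal sketch of the backfilling rule, assuming all jobs are submitted at time 0 and that the stated runtimes are exact (the very assumption the disadvantages above call into question). When the job at the head of the queue is blocked, the code works out when it will be able to start and lets later jobs run now only if that start is not pushed back. The function name and the example workload are hypothetical.

```python
def backfill_schedule(jobs, total_cpus):
    """jobs: list of (name, cpus, runtime), all submitted at time 0 in list order.
    Runtimes are treated as exact. Returns {name: start_time}."""
    if any(cpus > total_cpus for _, cpus, _ in jobs):
        raise ValueError("a job requests more CPUs than the cluster has")
    queue = list(jobs)
    running = []          # (end_time, cpus) of jobs in execution
    free, now, starts = total_cpus, 0, {}

    def start(job):
        nonlocal free
        name, cpus, runtime = job
        starts[name] = now
        running.append((now + runtime, cpus))
        free -= cpus

    while queue:
        # 1. Plain FCFS: start jobs from the head of the queue while they fit.
        while queue and queue[0][1] <= free:
            start(queue.pop(0))
        if not queue:
            break
        # 2. The head job is blocked: find when enough CPUs will be free for it
        #    (shadow time) and how many CPUs will be left over at that moment.
        head_cpus, avail = queue[0][1], free
        for end, cpus in sorted(running):
            avail += cpus
            if avail >= head_cpus:
                shadow, extra = end, avail - head_cpus
                break
        # 3. Backfill: a later job may start now if it fits in the free CPUs and
        #    either finishes by the shadow time or uses only the left-over CPUs.
        for job in queue[1:]:
            _, cpus, runtime = job
            if cpus <= free and (now + runtime <= shadow or cpus <= extra):
                if now + runtime > shadow:
                    extra -= cpus
                start(job)
                queue.remove(job)
        # 4. Advance time to the next job completion and release its CPUs.
        now = min(end for end, _ in running)
        free += sum(cpus for end, cpus in running if end == now)
        running = [(end, cpus) for end, cpus in running if end != now]
    return starts


if __name__ == "__main__":
    # Hypothetical workload on a 4-CPU cluster.
    print(backfill_schedule([("A", 2, 4), ("B", 4, 2), ("C", 2, 2)], total_cpus=4))
```

In this workload C is backfilled into the 2 CPUs left idle beside A and starts at time 0, while B still starts at time 4, exactly when it would have under FCFS.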

33 Backfilling: a problem if the predicted runtime is wrong

34 Scheduling Policies: Reservation
- Increasingly, user-based quality of service (QoS) is an important scheduling metric.
- In addition to normal scheduling, reservation services can be used to plan resource allocation.
- Users can set up a reserved block of processing capability to use at some point in the future.
- The task management system agrees to the reservation.
- Users can subsequently run jobs within their reservation quota.
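
One way to picture a reservation is as a time window plus a block of CPUs, with a check that a job stays inside both. The sketch below is only an illustration: the `Reservation` fields and `fits_reservation` are hypothetical names, and the CPUs already in use inside the reservation are assumed to stay constant while the job runs.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    start: float   # start of the reserved window
    end: float     # end of the reserved window
    cpus: int      # size of the reserved block


def fits_reservation(res, job_start, runtime, cpus, cpus_in_use=0):
    """True if a job stays inside the reserved time window and CPU block."""
    inside_window = res.start <= job_start and job_start + runtime <= res.end
    enough_cpus = cpus + cpus_in_use <= res.cpus
    return inside_window and enough_cpus


if __name__ == "__main__":
    res = Reservation(start=10, end=20, cpus=8)
    print(fits_reservation(res, job_start=12, runtime=5, cpus=4))
    print(fits_reservation(res, job_start=18, runtime=5, cpus=4))
```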

35 Coursework seminars
- A reminder that the allocation of coursework seminar groups is on my homepage.
- Start your coursework as early as possible.
- Consult your seminar tutors if you have problems with your coursework.


Overview of Tianhe-2 Overview of Tianhe-2 (MilkyWay-2) Supercomputer Yutong Lu School of Computer Science, National University of Defense Technology; State Key Laboratory of High Performance Computing, China ytlu@nudt.edu.cn

More information

WhatÕs New in the Message-Passing Toolkit

WhatÕs New in the Message-Passing Toolkit WhatÕs New in the Message-Passing Toolkit Karl Feind, Message-passing Toolkit Engineering Team, SGI ABSTRACT: SGI message-passing software has been enhanced in the past year to support larger Origin 2

More information

Using Quality of Service for Scheduling on Cray XT Systems

Using Quality of Service for Scheduling on Cray XT Systems Using Quality of Service for Scheduling on Cray XT Systems Troy Baer HPC System Administrator National Institute for Computational Sciences, University of Tennessee Outline Introduction Scheduling Cray

More information

Cheese Cluster Training

Cheese Cluster Training Cheese Cluster Training The Biostatistics Computer Committee (BCC) Anjishnu Banerjee Dan Eastwood Chris Edwards Michael Martens Rodney Sparapani Sergey Tarima and The Research Computing Center (RCC) Matthew

More information

Parallel & Cluster Computing. cs 6260 professor: elise de doncker by: lina hussein

Parallel & Cluster Computing. cs 6260 professor: elise de doncker by: lina hussein Parallel & Cluster Computing cs 6260 professor: elise de doncker by: lina hussein 1 Topics Covered : Introduction What is cluster computing? Classification of Cluster Computing Technologies: Beowulf cluster

More information

Introduction to GALILEO

Introduction to GALILEO Introduction to GALILEO Parallel & production environment Mirko Cestari m.cestari@cineca.it Alessandro Marani a.marani@cineca.it Alessandro Grottesi a.grottesi@cineca.it SuperComputing Applications and

More information

Best Practices for Setting BIOS Parameters for Performance

Best Practices for Setting BIOS Parameters for Performance White Paper Best Practices for Setting BIOS Parameters for Performance Cisco UCS E5-based M3 Servers May 2013 2014 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public. Page

More information

Cluster Computing. Cluster Architectures

Cluster Computing. Cluster Architectures Cluster Architectures Overview The Problem The Solution The Anatomy of a Cluster The New Problem A big cluster example The Problem Applications Many fields have come to depend on processing power for progress:

More information

Interconnect Your Future

Interconnect Your Future Interconnect Your Future Gilad Shainer 2nd Annual MVAPICH User Group (MUG) Meeting, August 2014 Complete High-Performance Scalable Interconnect Infrastructure Comprehensive End-to-End Software Accelerators

More information

MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced

MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced Sarvani Chadalapaka HPC Administrator University of California

More information

Getting started with the CEES Grid

Getting started with the CEES Grid Getting started with the CEES Grid October, 2013 CEES HPC Manager: Dennis Michael, dennis@stanford.edu, 723-2014, Mitchell Building room 415. Please see our web site at http://cees.stanford.edu. Account

More information

Storage System. David Southwell, Ph.D President & CEO Obsidian Strategics Inc. BB:(+1)

Storage System. David Southwell, Ph.D President & CEO Obsidian Strategics Inc. BB:(+1) Storage InfiniBand Area Networks System David Southwell, Ph.D President & CEO Obsidian Strategics Inc. BB:(+1) 780.964.3283 dsouthwell@obsidianstrategics.com Agenda System Area Networks and Storage Pertinent

More information

Job Management on LONI and LSU HPC clusters

Job Management on LONI and LSU HPC clusters Job Management on LONI and LSU HPC clusters Le Yan HPC Consultant User Services @ LONI Outline Overview Batch queuing system Job queues on LONI clusters Basic commands The Cluster Environment Multiple

More information

White Paper. Technical Advances in the SGI. UV Architecture

White Paper. Technical Advances in the SGI. UV Architecture White Paper Technical Advances in the SGI UV Architecture TABLE OF CONTENTS 1. Introduction 1 2. The SGI UV Architecture 2 2.1. SGI UV Compute Blade 3 2.1.1. UV_Hub ASIC Functionality 4 2.1.1.1. Global

More information

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved.

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved. Minnesota Supercomputing Institute Introduction to Job Submission and Scheduling Andrew Gustafson Interacting with MSI Systems Connecting to MSI SSH is the most reliable connection method Linux and Mac

More information

ACCRE High Performance Compute Cluster

ACCRE High Performance Compute Cluster 6 중 1 2010-05-16 오후 1:44 Enabling Researcher-Driven Innovation and Exploration Mission / Services Research Publications User Support Education / Outreach A - Z Index Our Mission History Governance Services

More information

Clusters. Rob Kunz and Justin Watson. Penn State Applied Research Laboratory

Clusters. Rob Kunz and Justin Watson. Penn State Applied Research Laboratory Clusters Rob Kunz and Justin Watson Penn State Applied Research Laboratory rfk102@psu.edu Contents Beowulf Cluster History Hardware Elements Networking Software Performance & Scalability Infrastructure

More information

MPICH-G2 performance evaluation on PC clusters

MPICH-G2 performance evaluation on PC clusters MPICH-G2 performance evaluation on PC clusters Roberto Alfieri Fabio Spataro February 1, 2001 1 Introduction The Message Passing Interface (MPI) [1] is a standard specification for message passing libraries.

More information

Using the IAC Chimera Cluster

Using the IAC Chimera Cluster Using the IAC Chimera Cluster Ángel de Vicente (Tel.: x5387) SIE de Investigación y Enseñanza Chimera overview Beowulf type cluster Chimera: a monstrous creature made of the parts of multiple animals.

More information

Parallel Programming with MPI

Parallel Programming with MPI Parallel Programming with MPI Science and Technology Support Ohio Supercomputer Center 1224 Kinnear Road. Columbus, OH 43212 (614) 292-1800 oschelp@osc.edu http://www.osc.edu/supercomputing/ Functions

More information

Intel Manycore Testing Lab (MTL) - Linux Getting Started Guide

Intel Manycore Testing Lab (MTL) - Linux Getting Started Guide Intel Manycore Testing Lab (MTL) - Linux Getting Started Guide Introduction What are the intended uses of the MTL? The MTL is prioritized for supporting the Intel Academic Community for the testing, validation

More information

Linux Clusters for High- Performance Computing: An Introduction

Linux Clusters for High- Performance Computing: An Introduction Linux Clusters for High- Performance Computing: An Introduction Jim Phillips, Tim Skirvin Outline Why and why not clusters? Consider your Users Application Budget Environment Hardware System Software HPC

More information

XSEDE New User Tutorial

XSEDE New User Tutorial April 2, 2014 XSEDE New User Tutorial Jay Alameda National Center for Supercomputing Applications XSEDE Training Survey Make sure you sign the sign in sheet! At the end of the module, I will ask you to

More information

Microsoft SQL Server 2012 Fast Track Reference Configuration Using PowerEdge R720 and EqualLogic PS6110XV Arrays

Microsoft SQL Server 2012 Fast Track Reference Configuration Using PowerEdge R720 and EqualLogic PS6110XV Arrays Microsoft SQL Server 2012 Fast Track Reference Configuration Using PowerEdge R720 and EqualLogic PS6110XV Arrays This whitepaper describes Dell Microsoft SQL Server Fast Track reference architecture configurations

More information

PBS Pro Documentation

PBS Pro Documentation Introduction Most jobs will require greater resources than are available on individual nodes. All jobs must be scheduled via the batch job system. The batch job system in use is PBS Pro. Jobs are submitted

More information

Performance Characterization of the Dell Flexible Computing On-Demand Desktop Streaming Solution

Performance Characterization of the Dell Flexible Computing On-Demand Desktop Streaming Solution Performance Characterization of the Dell Flexible Computing On-Demand Desktop Streaming Solution Product Group Dell White Paper February 28 Contents Contents Introduction... 3 Solution Components... 4

More information

Voltaire Making Applications Run Faster

Voltaire Making Applications Run Faster Voltaire Making Applications Run Faster Asaf Somekh Director, Marketing Voltaire, Inc. Agenda HPC Trends InfiniBand Voltaire Grid Backbone Deployment examples About Voltaire HPC Trends Clusters are the

More information

ASPERA HIGH-SPEED TRANSFER. Moving the world s data at maximum speed

ASPERA HIGH-SPEED TRANSFER. Moving the world s data at maximum speed ASPERA HIGH-SPEED TRANSFER Moving the world s data at maximum speed ASPERA HIGH-SPEED FILE TRANSFER Aspera FASP Data Transfer at 80 Gbps Elimina8ng tradi8onal bo

More information

Lecture 20: Distributed Memory Parallelism. William Gropp

Lecture 20: Distributed Memory Parallelism. William Gropp Lecture 20: Distributed Parallelism William Gropp www.cs.illinois.edu/~wgropp A Very Short, Very Introductory Introduction We start with a short introduction to parallel computing from scratch in order

More information

The Tofu Interconnect D

The Tofu Interconnect D The Tofu Interconnect D 11 September 2018 Yuichiro Ajima, Takahiro Kawashima, Takayuki Okamoto, Naoyuki Shida, Kouichi Hirai, Toshiyuki Shimizu, Shinya Hiramoto, Yoshiro Ikeda, Takahide Yoshikawa, Kenji

More information

Scalability and Classifications

Scalability and Classifications Scalability and Classifications 1 Types of Parallel Computers MIMD and SIMD classifications shared and distributed memory multicomputers distributed shared memory computers 2 Network Topologies static

More information