Brutus. Above and beyond Hreidar and Gonzales


1 Brutus: Above and beyond Hreidar and Gonzales
Dr. Olivier Byrde, Head of HPC Group, IT Services, ETH Zurich
Teodoro Brasacchio, HPC Group, IT Services, ETH Zurich

2 Outline
- High-performance computing at ETH (Olivier Byrde)
  - The shareholder model
  - Central clusters and their applications
  - Introducing Brutus
- A closer look at Brutus (Teodoro Brasacchio)
  - The Brutus platform
  - Key features
- Next steps (Olivier Byrde)
  - Installation status
  - Integration of Hreidar and Gonzales
  - Extension of Brutus

3 Prologue
- Not so long ago supercomputing was reserved for an elite
  - Supercomputers were very expensive (at least $10M)
  - They were housed in big supercomputer centers
  - A small army was needed to maintain and operate them
- There was a huge demand for smaller, cheaper systems
  - Some companies made mini-supercomputers ("super-minis")
  - Others made super-workstations
  - Few of them were commercial successes
- Then came the Beowulf revolution
  - Started in 1994 by Donald Becker and Thomas Sterling, NASA
  - Quickly adopted by users and vendors (those still in business!)
  - Completely dominates the HPC landscape today

4 Part I: High-performance computing at ETH
Dr. Olivier Byrde, Head of HPC Group, IT Services, ETH Zurich

5 Shareholder model
- Professors who need lots of computing power pool their resources to finance a large, common cluster
- The IT Services take care of the purchase, maintenance and operation of the cluster
- Professors receive a share of the cluster proportional to their investment
- Shares are valid for the whole lifetime of the cluster
- Advantages
  - Cheaper than buying an individual cluster (economy of scale)
  - No need to worry about maintenance and administration
  - Better utilization of computing resources
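As a rough illustration of the proportional-share rule on this slide, the sketch below computes each shareholder's fraction of a cluster from the amount invested. The function name, the group names and the amounts are hypothetical; the slides give neither actual figures nor the accounting method used by the IT Services.

```python
from fractions import Fraction


def cluster_shares(investments):
    """Return each shareholder's share, proportional to the amount invested.

    `investments` maps a (hypothetical) shareholder name to an integer amount.
    """
    total = sum(investments.values())
    if total <= 0:
        raise ValueError("total investment must be positive")
    return {name: Fraction(amount, total) for name, amount in investments.items()}


# Illustrative numbers only; they are not taken from the presentation.
example = {"group_a": 300_000, "group_b": 100_000, "it_services": 400_000}
shares = cluster_shares(example)
for name, share in shares.items():
    print(f"{name}: {float(share):.1%}")
```

Exact fractions are used so that the shares always sum to exactly one, which makes the rule easy to check.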

6 Shareholder model in practice
- The central clusters of ETH are jointly owned by 22 professors* in 9 departments and the IT Services
- The share of the IT Services is made available to the whole scientific community of ETH
  - Any member of ETH can request an account on the central clusters
  - Shareholders enjoy special privileges
*) Abhari, Aebersold, Anastasiou, Bonhoeffer, Carollo, Govindjee, Gruber, Hiptmair, Katzgraber, Knutti, Koumoutsakos, Kröger, Lilly, Lüthi, Oganov, Öttinger, Parrinello, Pelkmans, Poulikakos, Stelling, Tackley and Troyer

7 Shareholders
[Figure: shareholders of Hreidar and Gonzales]

8 Central clusters
- Hreidar
  - In operation since [...]
  - AMD Opteron processors, [...] GHz (single-core)
  - Gigabit Ethernet network
  - Red Hat Enterprise Linux 3
- Gonzales
  - In operation since [...]
  - AMD Opteron processors, 2.4 GHz (single-core)
  - High-performance Quadrics QsNet II network
  - SuSE Linux 9.2

9 Intended use
- Hreidar
  - Serial applications
  - Parallel applications with little communication
  - Embarrassingly parallel computations (e.g. Monte-Carlo)
- Gonzales
  - Communication-intensive parallel computations (MPI)

10 Observations
- Many users have access to only one cluster
  - Not necessarily the best system for their applications
- Those who use both Hreidar and Gonzales must cope with different:
  - Operating systems
  - Compilers, libraries and applications
  - Usage rules and policies
- These differences prevent an optimal utilization of the resources provided by the central clusters

11 Solution
- Standardize...
  - Same operating system on both clusters
  - Same compilers and libraries
  - Common user name and authentication
- ...and integrate
  - Merge the existing clusters
  - Work in progress

12 Step-by-step approach
- Develop a new cluster platform from scratch
  - Take the best from Hreidar and Gonzales
  - Learn from our mistakes
- Implement and test this platform on a new cluster
  - Easier than changing existing clusters on-the-fly (no down-time)
- Migrate users to the new cluster
  - Beta users first
  - Once the system is proven stable, normal users
- Move Hreidar and Gonzales to the new platform
  - Software upgrade (OS, cluster management, applications)
  - Hardware integration (network, storage, etc.)

13 Introducing the Brutus platform
Better Reliability and Usability Thanks to Unified System

14 Part II: A closer look at Brutus
Teodoro Brasacchio, HPC Group, IT Services, ETH Zurich

15 Brutus platform
- Common platform for existing and future clusters
  - Hardware (network, storage, etc.)
  - Software (OS, compilers, libraries, applications)
  - Batch system (policies, queues, fair-share)
- Simpler to use
  - Single user environment (NETHZ login)
- Simpler to administer
  - The same staff can manage a much larger system
- Better user support
  - The less time we need to manage the system, the more time we can spend supporting our users

16 Brutus: user's view
[Figure: Brutus as seen by the user]

17 Brutus: behind the scenes
[Figure: Brutus behind the scenes]

18 Networks
- Gigabit Ethernet
  - Backbone of the Brutus platform
  - Up to 1200 nodes
- Quadrics QsNet II
  - High bandwidth (870 MB/s sustained)
  - Low latency (1 µs)
  - Up to 512 nodes
- Management network

19 Compute nodes
- 2008/2: 280 nodes / 1216 cores
  - 272 nodes with 4 cores (2.8 GHz)
  - 8 fat nodes with 16 cores (2.8 GHz)
- 2008/4: 756 nodes / 2168 cores
  - 272 nodes with 4 cores (2.8 GHz)
  - 8 fat nodes with 16 cores (2.8 GHz)
  - 196 nodes with 2 cores ([...] GHz), ex-Hreidar
  - 280 nodes with 2 cores (2.4 GHz), ex-Gonzales
- 2008/6: 968 nodes / [...] cores
  - Same as 2008/4, plus at least 212 nodes with either 4 cores (2.8 GHz) or 8 cores (2.2 GHz)
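The node and core totals quoted for 2008/2 and 2008/4 can be checked with a few lines of arithmetic. The sketch below only restates the per-group figures from the slide; the variable and function names are illustrative. The 2008/6 core count is left out because it depends on the node type of the extension, which the slide leaves open.

```python
# (number of nodes, cores per node) for each hardware group listed on the slide
BRUTUS_2008_2 = [(272, 4), (8, 16)]
EX_HREIDAR = [(196, 2)]
EX_GONZALES = [(280, 2)]


def totals(groups):
    """Return (total nodes, total cores) for a list of (nodes, cores_per_node)."""
    nodes = sum(n for n, _ in groups)
    cores = sum(n * c for n, c in groups)
    return nodes, cores


stage_2008_2 = totals(BRUTUS_2008_2)
stage_2008_4 = totals(BRUTUS_2008_2 + EX_HREIDAR + EX_GONZALES)

print(stage_2008_2)  # (280, 1216)
print(stage_2008_4)  # (756, 2168)
```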

20 Storage
- SAN: ~10 TB
  - Storage for home directories
  - Subject to relatively small quota (to be defined)
  - Backed up daily (NetBackup)
- Panasas: ~40 TB
  - Medium-term storage for work directories
  - Short-term storage (scratch) for very large data sets
  - Extremely fast, ideal for I/O-intensive applications
  - No backup
- Users can rent additional disk space if needed

21 Reliability
- No single point of failure... almost
  - Redundant servers for all critical services
  - Redundant storage (SAN, Panasas)
  - Redundant network (Quadrics)
  - Redundant (uninterruptible) power supply
  - Only exception: Ethernet network
- 24x7 availability
  - Login always possible even if a login node crashed
  - Files always accessible even if a file server crashed
- No down-time necessary for regular maintenance
  - Redundant components can be upgraded on-the-fly

22 System
- Single access point: brutus.ethz.ch
- Single contact for support: cluster-support@id.ethz.ch
- Central user authentication: NETHZ
- Operating system: Red Hat Enterprise Linux 5
- Batch system: LSF HPC 6.2
- Compilers: GCC 4.1, Intel 10.1, PGI

23 Applications
- Brutus can handle all the applications currently running on Hreidar and Gonzales
  - Serial and embarrassingly parallel
  - Communication-intensive
- It also opens the door to a new range of applications
  - OpenMP up to 16 threads
  - Commercial applications (CFX, Fluent, MATLAB, etc.)
- The batch system takes care of the allocation of resources based on an application's requirements
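To make the idea of requirement-based allocation concrete, here is a toy sketch that maps a job's requirements to a node type and a network, using only the hardware described in the slides for 2008/2 (4-core nodes, 16-core fat nodes, Gigabit Ethernet, Quadrics). It is not how LSF is configured on Brutus: the `JobRequest` class, the `choose_resources` function and the decision rules are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """Hypothetical description of what an application needs."""
    mpi_ranks: int = 1                     # number of MPI processes
    threads_per_rank: int = 1              # OpenMP threads per process
    communication_intensive: bool = False  # heavy inter-node communication?


def choose_resources(job):
    """Pick a node type and network for a job (illustrative rules only)."""
    if job.threads_per_rank > 16:
        raise ValueError("the slides mention OpenMP with up to 16 threads")
    # More threads than a 4-core node offers: assume a 16-core fat node.
    node_type = "fat (16 cores)" if job.threads_per_rank > 4 else "standard (4 cores)"
    # Assume Quadrics is reserved for communication-intensive MPI jobs.
    uses_quadrics = job.mpi_ranks > 1 and job.communication_intensive
    network = "Quadrics" if uses_quadrics else "Ethernet"
    return node_type, network


print(choose_resources(JobRequest(threads_per_rank=16)))
print(choose_resources(JobRequest(mpi_ranks=64, communication_intensive=True)))
```

In practice the user would state such requirements when submitting the job, and the batch system's own policies would decide where it runs.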

24 Peak performance
[Figure: peak performance in teraflops of Hreidar, Gonzales and Brutus for 2008/2, 2008/4 and 2008/6]

25 Part III: Next steps
Dr. Olivier Byrde, Head of HPC Group, IT Services, ETH Zurich

26 Disclaimer
- All the information presented hereafter is valid as of March 5, 2008
- Things are changing extremely rapidly
- Go to [...] for up-to-date information

27 History
- 2006
  - June: first discussions with potential shareholders
- 2007
  - January-March: definition of the cluster's specifications
  - April-May: call for tender
  - June-July: evaluation of all offers, selection of winning bid
  - October: purchase approved by Executive Board of ETH
  - December: hardware delivery and installation
- 2008
  - January: hardware tests, software installation, alpha users
  - February: acceptance tests, beta users
  - March: third-party software installation, normal users

28 Current status
- Brutus has passed all basic hardware tests
  - About 5% of the compute nodes failed during these tests
  - All have been repaired or exchanged
- Ethernet part is functional
  - Software installation and configuration is in progress
  - Compilers, libraries and some applications are available
  - Open to beta users
- Quadrics part is still in alpha stage
  - Hardware installation is not complete yet
  - Software installation has just started
  - Testing will start immediately thereafter
  - Can be used for benchmarking purposes if needed

29 Roadmap (subject to change)
- March 2008
  - Interconnection of Quadrics networks of Brutus and Gonzales
  - Gradual opening of Brutus to normal users
  - Ordering of Brutus extension (200+ nodes, Ethernet & Quadrics)
- April
  - Integration of Hreidar and Gonzales into Brutus (start)
  - Migration of all cluster users to Brutus
- May
  - Integration of Hreidar and Gonzales into Brutus (end)
- June
  - Delivery and installation of Brutus extension

30 Beta users
- A lot of software still has to be installed and/or tested
  - Compilers
  - Scientific libraries
  - Batch system
  - MPI and OpenMP applications
  - Third-party applications
- We are looking for volunteers!
  - Please contact cluster-support@id.ethz.ch to apply for a beta user account

31 User migration
- Hreidar users must request a new account
  - Some users will have a different username on Brutus
  - External users may need to apply for a NETHZ account first
  - About 50 applications have been received so far
- Gonzales users will get an account automatically if
  - Their NETHZ account is still valid
  - They ran jobs on Gonzales in the last 6 months
  - About 150 users meet these requirements
- New users
  - New accounts will be created once Brutus is fully operational
  - In the meantime new users may apply for a beta user account
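The two conditions for automatic migration of Gonzales users can be written as a simple predicate. This is only a sketch: the function name and its arguments are hypothetical, and "the last 6 months" is approximated here as 183 days, which is an assumption and not a rule stated in the slides.

```python
from datetime import date, timedelta

# Assumption: "the last 6 months" is approximated as 183 days.
SIX_MONTHS = timedelta(days=183)


def gets_account_automatically(nethz_valid, last_gonzales_job, today):
    """Return True if a Gonzales user meets both conditions from the slide.

    nethz_valid: whether the user's NETHZ account is still valid
    last_gonzales_job: date of the user's most recent job on Gonzales, or None
    today: reference date for the six-month window
    """
    if last_gonzales_job is None:
        return False
    ran_recently = today - last_gonzales_job <= SIX_MONTHS
    return nethz_valid and ran_recently


reference = date(2008, 3, 5)  # date of the presentation
print(gets_account_automatically(True, date(2008, 1, 10), reference))
print(gets_account_automatically(True, date(2007, 6, 1), reference))
```

Users who fail either condition would follow the same path as Hreidar users and request an account explicitly.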

32 Very important!
- Hreidar and Gonzales will cease to exist as independent clusters in May 2008
- It is your responsibility to verify that all your applications will run on Brutus
- We will be happy to help you, but we cannot do all the work for you
- Do not wait until the last minute!

33 Brutus extension
- We are preparing the next cluster extension and expect to place an order in March 2008
  - Due to power constraints, this will be the only extension this year
- We already have firm commitments for over 200 nodes (Ethernet and Quadrics)
  - About 50 nodes are still up for grabs
- If you would like to take this opportunity to become a shareholder or to increase your share, please contact Olivier Byrde, byrde@id.ethz.ch

34 Thank you

Leonhard: a new cluster for Big Data at ETH

Leonhard: a new cluster for Big Data at ETH Leonhard: a new cluster for Big Data at ETH Bernd Rinn, Head of Scientific IT Services Olivier Byrde, Group leader High Performance Computing Bernd Rinn & Olivier Byrde 2017-02-15 1 Agenda Welcome address

More information

The Optimal CPU and Interconnect for an HPC Cluster

The Optimal CPU and Interconnect for an HPC Cluster 5. LS-DYNA Anwenderforum, Ulm 2006 Cluster / High Performance Computing I The Optimal CPU and Interconnect for an HPC Cluster Andreas Koch Transtec AG, Tübingen, Deutschland F - I - 15 Cluster / High Performance

More information

HPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances)

HPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances) HPC and IT Issues Session Agenda Deployment of Simulation (Trends and Issues Impacting IT) Discussion Mapping HPC to Performance (Scaling, Technology Advances) Discussion Optimizing IT for Remote Access

More information

Clusters. Rob Kunz and Justin Watson. Penn State Applied Research Laboratory

Clusters. Rob Kunz and Justin Watson. Penn State Applied Research Laboratory Clusters Rob Kunz and Justin Watson Penn State Applied Research Laboratory rfk102@psu.edu Contents Beowulf Cluster History Hardware Elements Networking Software Performance & Scalability Infrastructure

More information

ACCRE High Performance Compute Cluster

ACCRE High Performance Compute Cluster 6 중 1 2010-05-16 오후 1:44 Enabling Researcher-Driven Innovation and Exploration Mission / Services Research Publications User Support Education / Outreach A - Z Index Our Mission History Governance Services

More information

HPC Architectures. Types of resource currently in use

HPC Architectures. Types of resource currently in use HPC Architectures Types of resource currently in use Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us

More information

Quotations invited. 2. The supplied hardware should have 5 years comprehensive onsite warranty (24 x 7 call logging) from OEM directly.

Quotations invited. 2. The supplied hardware should have 5 years comprehensive onsite warranty (24 x 7 call logging) from OEM directly. Enquiry No: IITK/ME/mkdas/2016/01 May 04, 2016 Quotations invited Sealed quotations are invited for the purchase of an HPC cluster with the specification outlined below. Technical as well as the commercial

More information

Flux: The State of the Cluster

Flux: The State of the Cluster Flux: The State of the Cluster Andrew Caird acaird@umich.edu 7 November 2012 Questions Thank you all for coming. Questions? Andy Caird (acaird@umich.edu, hpc-support@umich.edu) Flux Since Last November

More information

The Red Storm System: Architecture, System Update and Performance Analysis

The Red Storm System: Architecture, System Update and Performance Analysis The Red Storm System: Architecture, System Update and Performance Analysis Douglas Doerfler, Jim Tomkins Sandia National Laboratories Center for Computation, Computers, Information and Mathematics LACSI

More information

Network Design Considerations for Grid Computing

Network Design Considerations for Grid Computing Network Design Considerations for Grid Computing Engineering Systems How Bandwidth, Latency, and Packet Size Impact Grid Job Performance by Erik Burrows, Engineering Systems Analyst, Principal, Broadcom

More information

Computer Aided Engineering with Today's Multicore, InfiniBand-Based Clusters ANSYS, Inc. All rights reserved. 1 ANSYS, Inc.

Computer Aided Engineering with Today's Multicore, InfiniBand-Based Clusters ANSYS, Inc. All rights reserved. 1 ANSYS, Inc. Computer Aided Engineering with Today's Multicore, InfiniBand-Based Clusters 2006 ANSYS, Inc. All rights reserved. 1 ANSYS, Inc. Proprietary Our Business Simulation Driven Product Development Deliver superior

More information

HPC at UZH: status and plans

HPC at UZH: status and plans HPC at UZH: status and plans Dec. 4, 2013 This presentation s purpose Meet the sysadmin team. Update on what s coming soon in Schroedinger s HW. Review old and new usage policies. Discussion (later on).

More information

Outline. March 5, 2012 CIRMMT - McGill University 2

Outline. March 5, 2012 CIRMMT - McGill University 2 Outline CLUMEQ, Calcul Quebec and Compute Canada Research Support Objectives and Focal Points CLUMEQ Site at McGill ETS Key Specifications and Status CLUMEQ HPC Support Staff at McGill Getting Started

More information

InfoBrief. Platform ROCKS Enterprise Edition Dell Cluster Software Offering. Key Points

InfoBrief. Platform ROCKS Enterprise Edition Dell Cluster Software Offering. Key Points InfoBrief Platform ROCKS Enterprise Edition Dell Cluster Software Offering Key Points High Performance Computing Clusters (HPCC) offer a cost effective, scalable solution for demanding, compute intensive

More information

Our new HPC-Cluster An overview

Our new HPC-Cluster An overview Our new HPC-Cluster An overview Christian Hagen Universität Regensburg Regensburg, 15.05.2009 Outline 1 Layout 2 Hardware 3 Software 4 Getting an account 5 Compiling 6 Queueing system 7 Parallelization

More information

Mellanox Technologies Maximize Cluster Performance and Productivity. Gilad Shainer, October, 2007

Mellanox Technologies Maximize Cluster Performance and Productivity. Gilad Shainer, October, 2007 Mellanox Technologies Maximize Cluster Performance and Productivity Gilad Shainer, shainer@mellanox.com October, 27 Mellanox Technologies Hardware OEMs Servers And Blades Applications End-Users Enterprise

More information

Managing complex cluster architectures with Bright Cluster Manager

Managing complex cluster architectures with Bright Cluster Manager Managing complex cluster architectures with Bright Cluster Manager Christopher Huggins www.clustervision.com 1 About ClusterVision Specialists in Compute, Storage & Database Clusters (Tailor-Made, Turn-Key)

More information

The Use of Cloud Computing Resources in an HPC Environment

The Use of Cloud Computing Resources in an HPC Environment The Use of Cloud Computing Resources in an HPC Environment Bill, Labate, UCLA Office of Information Technology Prakashan Korambath, UCLA Institute for Digital Research & Education Cloud computing becomes

More information

Introduction to High Performance Computing at ZIH

Introduction to High Performance Computing at ZIH Center for Information Services and High Performance Computing (ZIH) Introduction to High Performance Computing at ZIH Architecture of the PC Farm (Deimos) Zellescher Weg 12 Trefftz-Bau/HRSK 151 Phone

More information

TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 13 th CALL (T ier-0)

TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 13 th CALL (T ier-0) TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 13 th CALL (T ier-0) Contributing sites and the corresponding computer systems for this call are: BSC, Spain IBM System x idataplex CINECA, Italy Lenovo System

More information

Designing a Cluster for a Small Research Group

Designing a Cluster for a Small Research Group Designing a Cluster for a Small Research Group Jim Phillips, John Stone, Tim Skirvin Low-cost Linux Clusters for Biomolecular Simulations Using NAMD Outline Why and why not clusters? Consider your Users

More information

Updating the HPC Bill Punch, Director HPCC Nov 17, 2017

Updating the HPC Bill Punch, Director HPCC Nov 17, 2017 Updating the HPC 2018 Bill Punch, Director HPCC Nov 17, 2017 Unique Opportunity The plan for HPC and the new data center is to stand up a new system in the DC, while maintaining the old system for awhile

More information

Cluster Network Products

Cluster Network Products Cluster Network Products Cluster interconnects include, among others: Gigabit Ethernet Myrinet Quadrics InfiniBand 1 Interconnects in Top500 list 11/2009 2 Interconnects in Top500 list 11/2008 3 Cluster

More information

Rechenzentrum HIGH PERFORMANCE SCIENTIFIC COMPUTING

Rechenzentrum HIGH PERFORMANCE SCIENTIFIC COMPUTING Rechenzentrum HIGH PERFORMANCE SCIENTIFIC COMPUTING Contents Scientifi c Supercomputing Center Karlsruhe (SSCK)... 4 Consultation and Support... 5 HP XC 6000 Cluster at the SSC Karlsruhe... 6 Architecture

More information

Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems.

Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems. Cluster Networks Introduction Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems. As usual, the driver is performance

More information

Readme for Platform Open Cluster Stack (OCS)

Readme for Platform Open Cluster Stack (OCS) Readme for Platform Open Cluster Stack (OCS) Version 4.1.1-2.0 October 25 2006 Platform Computing Contents What is Platform OCS? What's New in Platform OCS 4.1.1-2.0? Supported Architecture Distribution

More information

The Why and How of HPC-Cloud Hybrids with OpenStack

The Why and How of HPC-Cloud Hybrids with OpenStack The Why and How of HPC-Cloud Hybrids with OpenStack OpenStack Australia Day Melbourne June, 2017 Lev Lafayette, HPC Support and Training Officer, University of Melbourne lev.lafayette@unimelb.edu.au 1.0

More information

TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 6 th CALL (Tier-0)

TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 6 th CALL (Tier-0) TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 6 th CALL (Tier-0) Contributing sites and the corresponding computer systems for this call are: GCS@Jülich, Germany IBM Blue Gene/Q GENCI@CEA, France Bull Bullx

More information

An Oracle White Paper December Accelerating Deployment of Virtualized Infrastructures with the Oracle VM Blade Cluster Reference Configuration

An Oracle White Paper December Accelerating Deployment of Virtualized Infrastructures with the Oracle VM Blade Cluster Reference Configuration An Oracle White Paper December 2010 Accelerating Deployment of Virtualized Infrastructures with the Oracle VM Blade Cluster Reference Configuration Introduction...1 Overview of the Oracle VM Blade Cluster

More information

Standard Service Level Agreement (Service based SLA) for Scientific Compute Clusters

Standard Service Level Agreement (Service based SLA) for Scientific Compute Clusters Informatikdiens IT Services Direction ETH Zürich Stampfenbachstrasse 69 8092 Zürich www.id.ethz.ch Standard Service Level Agreement (Service based SLA) for Scientific Compute Clusters Table of Contents

More information

Design and Evaluation of a 2048 Core Cluster System

Design and Evaluation of a 2048 Core Cluster System Design and Evaluation of a 2048 Core Cluster System, Torsten Höfler, Torsten Mehlan and Wolfgang Rehm Computer Architecture Group Department of Computer Science Chemnitz University of Technology December

More information

PART-I (B) (TECHNICAL SPECIFICATIONS & COMPLIANCE SHEET) Supply and installation of High Performance Computing System

PART-I (B) (TECHNICAL SPECIFICATIONS & COMPLIANCE SHEET) Supply and installation of High Performance Computing System INSTITUTE FOR PLASMA RESEARCH (An Autonomous Institute of Department of Atomic Energy, Government of India) Near Indira Bridge; Bhat; Gandhinagar-382428; India PART-I (B) (TECHNICAL SPECIFICATIONS & COMPLIANCE

More information

COSC 6385 Computer Architecture - Multi Processor Systems

COSC 6385 Computer Architecture - Multi Processor Systems COSC 6385 Computer Architecture - Multi Processor Systems Fall 2006 Classification of Parallel Architectures Flynn s Taxonomy SISD: Single instruction single data Classical von Neumann architecture SIMD:

More information

CS500 SMARTER CLUSTER SUPERCOMPUTERS

CS500 SMARTER CLUSTER SUPERCOMPUTERS CS500 SMARTER CLUSTER SUPERCOMPUTERS OVERVIEW Extending the boundaries of what you can achieve takes reliable computing tools matched to your workloads. That s why we tailor the Cray CS500 cluster supercomputer

More information

Parallel File Systems Compared

Parallel File Systems Compared Parallel File Systems Compared Computing Centre (SSCK) University of Karlsruhe, Germany Laifer@rz.uni-karlsruhe.de page 1 Outline» Parallel file systems (PFS) Design and typical usage Important features

More information

Batch Scheduling on XT3

Batch Scheduling on XT3 Batch Scheduling on XT3 Chad Vizino Pittsburgh Supercomputing Center Overview Simon Scheduler Design Features XT3 Scheduling at PSC Past Present Future Back to the Future! Scheduler Design

More information

Whitepaper / Benchmark

Whitepaper / Benchmark Whitepaper / Benchmark Web applications on LAMP run up to 8X faster with Dolphin Express DOLPHIN DELIVERS UNPRECEDENTED PERFORMANCE TO THE LAMP-STACK MARKET Marianne Ronström Open Source Consultant iclaustron

More information

The Hopper System: How the Largest* XE6 in the World Went From Requirements to Reality! Katie Antypas, Tina Butler, and Jonathan Carter

The Hopper System: How the Largest* XE6 in the World Went From Requirements to Reality! Katie Antypas, Tina Butler, and Jonathan Carter The Hopper System: How the Largest* XE6 in the World Went From Requirements to Reality! Katie Antypas, Tina Butler, and Jonathan Carter CUG 2011, May 25th, 2011 1 Requirements to Reality Develop RFP Select

More information

The AMD64 Technology for Server and Workstation. Dr. Ulrich Knechtel Enterprise Program Manager EMEA

The AMD64 Technology for Server and Workstation. Dr. Ulrich Knechtel Enterprise Program Manager EMEA The AMD64 Technology for Server and Workstation Dr. Ulrich Knechtel Enterprise Program Manager EMEA Agenda Direct Connect Architecture AMD Opteron TM Processor Roadmap Competition OEM support The AMD64

More information

OBTAINING AN ACCOUNT:

OBTAINING AN ACCOUNT: HPC Usage Policies The IIA High Performance Computing (HPC) System is managed by the Computer Management Committee. The User Policies here were developed by the Committee. The user policies below aim to

More information

Enhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations

Enhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations Performance Brief Quad-Core Workstation Enhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations With eight cores and up to 80 GFLOPS of peak performance at your fingertips,

More information

Introduction to iscsi

Introduction to iscsi Introduction to iscsi As Ethernet begins to enter into the Storage world a new protocol has been getting a lot of attention. The Internet Small Computer Systems Interface or iscsi, is an end-to-end protocol

More information

Parallel & Cluster Computing. cs 6260 professor: elise de doncker by: lina hussein

Parallel & Cluster Computing. cs 6260 professor: elise de doncker by: lina hussein Parallel & Cluster Computing cs 6260 professor: elise de doncker by: lina hussein 1 Topics Covered : Introduction What is cluster computing? Classification of Cluster Computing Technologies: Beowulf cluster

More information

Making Supercomputing More Available and Accessible Windows HPC Server 2008 R2 Beta 2 Microsoft High Performance Computing April, 2010

Making Supercomputing More Available and Accessible Windows HPC Server 2008 R2 Beta 2 Microsoft High Performance Computing April, 2010 Making Supercomputing More Available and Accessible Windows HPC Server 2008 R2 Beta 2 Microsoft High Performance Computing April, 2010 Windows HPC Server 2008 R2 Windows HPC Server 2008 R2 makes supercomputing

More information

MIGRATING TO THE SHARED COMPUTING CLUSTER (SCC) SCV Staff Boston University Scientific Computing and Visualization

MIGRATING TO THE SHARED COMPUTING CLUSTER (SCC) SCV Staff Boston University Scientific Computing and Visualization MIGRATING TO THE SHARED COMPUTING CLUSTER (SCC) SCV Staff Boston University Scientific Computing and Visualization 2 Glenn Bresnahan Director, SCV MGHPCC Buy-in Program Kadin Tseng HPC Programmer/Consultant

More information

AN OVERVIEW OF COMPUTING RESOURCES WITHIN MATHS AND UON

AN OVERVIEW OF COMPUTING RESOURCES WITHIN MATHS AND UON AN OVERVIEW OF COMPUTING RESOURCES WITHIN MATHS AND UON 1 PURPOSE OF THIS TALK Give an overview of the provision of computing facilities within Maths and UoN (Theo). When does one realise that should take

More information

AN INTRODUCTION TO CLUSTER COMPUTING

AN INTRODUCTION TO CLUSTER COMPUTING CLUSTERS AND YOU AN INTRODUCTION TO CLUSTER COMPUTING Engineering IT BrownBag Series 29 October, 2015 Gianni Pezzarossi Linux Systems Administrator Mark Smylie Hart Research Technology Facilitator WHAT

More information

The University of Michigan Center for Advanced Computing

The University of Michigan Center for Advanced Computing The University of Michigan Center for Advanced Computing Andy Caird acaird@umich.edu The University of MichiganCenter for Advanced Computing p.1/29 The CAC What is the Center for Advanced Computing? we

More information

Veritas NetBackup on Cisco UCS S3260 Storage Server

Veritas NetBackup on Cisco UCS S3260 Storage Server Veritas NetBackup on Cisco UCS S3260 Storage Server This document provides an introduction to the process for deploying the Veritas NetBackup master server and media server on the Cisco UCS S3260 Storage

More information

Brand-New Vector Supercomputer

Brand-New Vector Supercomputer Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13 1 New Product NEC Released A Brand-New Vector Supercomputer, SX-ACE Just Now. Vector Supercomputer for Memory Bandwidth

More information

Day 9: Introduction to CHTC

Day 9: Introduction to CHTC Day 9: Introduction to CHTC Suggested reading: Condor 7.7 Manual: http://www.cs.wisc.edu/condor/manual/v7.7/ Chapter 1: Overview Chapter 2: Users Manual (at most, 2.1 2.7) 1 Turn In Homework 2 Homework

More information

Real Parallel Computers

Real Parallel Computers Real Parallel Computers Modular data centers Background Information Recent trends in the marketplace of high performance computing Strohmaier, Dongarra, Meuer, Simon Parallel Computing 2005 Short history

More information

Introduction to Parallel Programming

Introduction to Parallel Programming Introduction to Parallel Programming January 14, 2015 www.cac.cornell.edu What is Parallel Programming? Theoretically a very simple concept Use more than one processor to complete a task Operationally

More information

Outline. Execution Environments for Parallel Applications. Supercomputers. Supercomputers

Outline. Execution Environments for Parallel Applications. Supercomputers. Supercomputers Outline Execution Environments for Parallel Applications Master CANS 2007/2008 Departament d Arquitectura de Computadors Universitat Politècnica de Catalunya Supercomputers OS abstractions Extended OS

More information

Managing CAE Simulation Workloads in Cluster Environments

Managing CAE Simulation Workloads in Cluster Environments Managing CAE Simulation Workloads in Cluster Environments Michael Humphrey V.P. Enterprise Computing Altair Engineering humphrey@altair.com June 2003 Copyright 2003 Altair Engineering, Inc. All rights

More information

SuperMike-II Launch Workshop. System Overview and Allocations

SuperMike-II Launch Workshop. System Overview and Allocations : System Overview and Allocations Dr Jim Lupo CCT Computational Enablement jalupo@cct.lsu.edu SuperMike-II: Serious Heterogeneous Computing Power System Hardware SuperMike provides 442 nodes, 221TB of

More information

Building 96-processor Opteron Cluster at Florida International University (FIU) January 5-10, 2004
