High Performance Computing (HPC) Prepared By: Abdussamad Muntahi Muhammad Rahman


Introduction to High Performance Computing (HPC)

Introduction
High-speed computing. The term originally pertained only to supercomputers for scientific research; today it covers the tools and systems used to implement and create high performance computing systems, used for scientific research and computational science. The main area of the discipline is developing parallel processing algorithms and software, so that programs can be divided into small parts and executed simultaneously by separate processors. HPC systems have shifted from supercomputers to computing clusters.

What is a Cluster?
A cluster is a group of machines interconnected in a way that they work together as a single system. Clusters are used for better speed and capacity.
Types of cluster:
o High-availability (HA) clusters
o Load-balancing clusters
o Grid computing
Terminology:
o Node: an individual machine in a cluster
o Head node: connected to both the private network of the cluster and a public network, and used to access the cluster; responsible for providing users an environment to work in and for distributing tasks among the other nodes
o Compute nodes: connected only to the private network of the cluster and generally used for running jobs assigned to them by the head node(s)

Benefits of a Cluster
Reduced cost
o The price of off-the-shelf consumer desktops has plummeted in recent years, and this drop in price has corresponded with a vast increase in their processing power and performance. The average desktop PC today is many times more powerful than the first mainframe computers.
Processing power
o The parallel processing power of a high-performance cluster can, in many cases, prove more cost-effective than a mainframe with similar power. This reduced price per unit of power enables enterprises to get a greater ROI (return on investment) from their IT budget.
Scalability
o Perhaps the greatest advantage of computer clusters is the scalability they offer. While mainframe computers have a fixed processing capacity, computer clusters can be easily expanded as requirements change by adding nodes to the network.

Benefits of a Cluster
Improved network technology
o Driving the development of computer clusters has been a vast improvement in networking technology, along with a reduction in its price.
o In clusters, computers are typically connected via a single virtual local area network (VLAN), and the network treats each computer as a separate node. Information can be passed throughout these networks with very little lag, ensuring that data doesn't bottleneck between nodes.
Availability
o When a mainframe computer fails, the entire system fails. However, if a node in a computer cluster fails, its operations can simply be transferred to another node within the cluster, ensuring that there is no interruption in service.

Applications of HPC
HPC is used to solve complex modeling problems in a spectrum of disciplines. Topics include:
o Artificial intelligence
o Climate modeling
o Cryptographic analysis
o Geophysics
o Molecular biology
o Molecular dynamics
o Nuclear physics
o Physical oceanography
o Plasma physics
o Quantum physics
o Quantum chemistry
o Solid state physics
o Structural dynamics
HPC is currently applied to business uses as well:
o Data warehouses
o Line-of-business (LOB) applications
o Transaction processing

Top 10 Supercomputers for HPC, June 2011 (source: TOP500.org; trademarks and copyrights belong to their respective owners)

Typical Cluster

Fastest Supercomputer in the USA: Jaguar @ Oak Ridge National Lab (http://computing.ornl.gov)

Hardware and Our Cluster: SIHPCI (Maxwell)

Cluster Architecture: Typical

Cluster Architecture: Maxwell

The Cluster: Maxwell
Tech specs:
o No. of nodes: 106
o Each node has two quad-core Intel CPUs at 2.3 GHz
o Total no. of cores: 848
o RAM per node: 8 GB
o Storage: 90 TB

Hardware: Master/Head Node
The head node is responsible for providing users an environment to work in and for distributing tasks among the other nodes.
Minimum specification:
o CPU of i586 or above
o A network interface card that supports a TCP/IP stack
o At least 4 GB of total free space: 2 GB under / and 2 GB under /var
o A floppy drive
o A CD-ROM drive
(Photos: front end and back end.)

Hardware: Master/Head Node
Maxwell specification:
o Server format: rack
o CPU family: Intel Xeon
o CPU nominal frequency: 2.26 GHz
o Processor model: Xeon E5520
o Processors supplied: 2 quad-core
o Memory (RAM) capacity: 24 GB (6 x 4 GB)
o Memory type: DDR3
o Memory frequency: 1066 MHz, quad-ranked RDIMMs
o Storage (HDD): 146 GB 15K RPM Serial Attached SCSI
o RAID module: PERC 6/i SAS RAID controller, 2x4 connectors
o Gigabit LAN ports: 2
o Power supply rating: 480 W
o Idle power consumption: 150 W
o Peak power consumption: 270 W
o OS: Red Hat Enterprise Linux 5.3 AP, x32 and x64

Hardware: Computing Node (Client)
Dedicated to computation.
Minimum specification:
o CPU of i586 or above
o A disk on each client node, at least 2 GB in size
o A network interface card that supports a TCP/IP stack
o All clients must have the same architecture (e.g., ia32 vs. ia64)
o Monitors and keyboards may be helpful, but are not required
o Floppy drive or PXE-enabled BIOS
o A CD-ROM drive
(Photo: memory slots and two quad-core processors.)

Hardware: Computing Node (Client)
Maxwell specification:
o CPU family: Intel Xeon
o CPU nominal frequency: 2.13 GHz
o Processors supplied: 2 quad-core
o Memory (RAM) capacity: 8 GB (4 x 2 GB)
o Memory type: DDR3
o Memory frequency: 1333 MHz, dual-ranked UDIMMs
o Storage (HDD): 160 GB 7.2K RPM SATA
o Gigabit LAN ports: 2
o Power supply rating: 480 W
o Idle power consumption: 115 W
o Peak power consumption: 188 W
o OS: Red Hat Linux 5 HPC

Hardware: Switch
Minimum specification: a switch is necessary for communication between the nodes. Each node (including the head node) should have its own port on the switch; in other words, if there is one head node and 8 client nodes, you need at minimum a 9-port switch.

Hardware: Switch
Maxwell specification:
o Model: PowerConnect 6248
o Ports: 48 10/100/1000BASE-T auto-sensing Gigabit Ethernet switching ports
o 48-port GbE (Gigabit Ethernet) managed switch, with two 10GbE ports and stacking capability
(Photo: PowerConnect 6248 switch stack.)

Hardware: Power Distribution Unit
APC Switched Rack Power Distribution Units (PDUs) place rack equipment power control in the hands of the IT manager. Remote outlet-level controls allow power on/off functionality for power recycling to reboot locked-up equipment and to avoid unauthorized use of individual outlets. Power sequencing delays allow users to define the order in which to power up or down attached equipment, avoiding circuit overload during power recovery and extending the uptime of critical equipment by prioritizing load shedding.
o PDU plug type: L6-30P
o PDU model: APC AP7541
o PDU max amperage load: 30 A

Hardware: External Storage Array
Minimum specification:
o Model: PowerVault MD1000
o Max supported capacity per drive: 1.1 TB
o Host channels: 2
o Data transfer rate: 300 MB/s
o Supported devices: hard drive, disk array (RAID)
o Spindle speed: 15,000 RPM
Maxwell specification:
o Total storage arrays: 6
o HDDs in each storage array: 15
o Each HDD: 1 x 1 TB
o Total storage capacity: 6 x 15 x 1 TB (about 90 TB raw)
(Photos: front side and back side.)

Hardware: KVM Switch
A KVM (keyboard, video, mouse) switch is a device used to connect a keyboard, mouse, and monitor to two or more computers. KVM switches save money, time, space, equipment, and power. These switches are also widely deployed to control pools of servers in data centers. Some KVM switches support user terminals at both ends, allowing local and remote access to all the computers or servers.

Hardware: Networking
Cluster nodes are interconnected with both GigE (a Dell PowerConnect 6248 48-port GbE managed switch, plus 2 x Dell PowerConnect 3424 24-port FE switches with 2 GbE copper ports and 2 GbE fiber SFP ports) and InfiniBand (a Dell 24-port internally managed 9024 DDR InfiniBand edge switch).
(Photos: switches and cards, switch ports, cable connectivity.)

Maxwell (Front)

Maxwell (Back)

Software for High Performance Computing (HPC)

Software for HPC
For effective use of the cluster for HPC, the following tools are at our disposal:
o Remote hardware management: remote power on/off, CPU monitoring (temperature, etc.)
o Cluster management: monitoring programs, system administration, etc.
o Job scheduling
o Libraries/languages for parallel programming: Message Passing Interface (MPI)

Cluster Management
Cluster management software offers:
o An easy-to-use interface for managing clusters
o Automation of the process of queuing jobs
o Matching the requirements of a job and the resources available to the cluster
o Migrating jobs across the cluster
Maxwell uses Red Hat Enterprise Linux.

Cluster Management: Red Hat Enterprise Linux
o Suited to scientific computing: used to deploy clusters of systems that work together
o Excellent hardware detection and monitoring capabilities
o Centralized authentication and logging services
o Fast I/O (input/output)

Parallel Computing
A form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently, i.e. "in parallel".
Different forms of parallel computing:
o Bit-level parallelism
o Instruction-level parallelism
o Data parallelism
o Task parallelism
Parallel computer classification:
o Multiple processing elements (multi-core and multi-processor) within a single machine
o Multiple computers working on the same task: clusters, MPPs (massively parallel processing systems), and grids

Parallel Programming
Parallel computer programs are more difficult to write than sequential programs.
Potential problems:
o Race conditions (the output depends on the sequence or timing of other events); see the sketch after this list
o Communication and synchronization between the different subtasks
HPC parallel programming models are associated with different computing technologies:
o Single Instruction Multiple Data (SIMD) on single processors
o Multi-process and multi-threading on SMP (symmetric multiprocessing) computers
o Message Passing Interface (MPI) on clusters
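A race condition is easy to demonstrate. The following minimal sketch in C (an illustration added here, not part of the original slides) runs two POSIX threads that increment a shared counter without synchronization, so the final value depends on thread timing; protecting the increment with a mutex or an atomic removes the race.

    /* race.c: two threads update a shared counter with no synchronization. */
    /* Build on a typical Linux node with: cc -pthread race.c -o race       */
    #include <stdio.h>
    #include <pthread.h>

    static long counter = 0;                 /* shared, unprotected state */

    static void *worker(void *arg)
    {
        for (int i = 0; i < 1000000; i++)
            counter++;                       /* load, add, store: not atomic */
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        /* Usually prints less than 2000000 because increments are lost. */
        printf("counter = %ld (expected 2000000)\n", counter);
        return 0;
    }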

Parallel Programming: Message Passing Interface (MPI)
o An application programming interface (API) specification that allows processes to communicate with one another by sending and receiving messages
o Now a de facto standard for parallel programs running on distributed-memory systems such as computer clusters and supercomputers
o A message-passing API with language-independent protocol and semantic specifications
o Supports both point-to-point and collective communication
o Goals are high performance, scalability, and portability
o Consists of a specific set of routines (i.e., APIs) directly callable from C, C++, Fortran, and any language able to interface with such libraries, including C#, Java, or Python
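To make the API concrete, here is a minimal MPI program in C (a sketch added for illustration; the file name and messages are our own). It shows the basic rank and size queries, a point-to-point exchange with MPI_Send/MPI_Recv, and a collective MPI_Reduce. On a typical cluster it is compiled with an MPI wrapper compiler such as mpicc and launched with mpirun/mpiexec or through the cluster's job scheduler.

    /* hello_mpi.c: point-to-point and collective communication in MPI. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);                  /* start the MPI runtime         */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's id (0..size-1) */
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes     */

        if (rank == 0) {
            /* Point-to-point: rank 0 receives one message from every other rank. */
            int msg;
            for (int src = 1; src < size; src++) {
                MPI_Recv(&msg, 1, MPI_INT, src, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                printf("rank 0 received a greeting from rank %d\n", msg);
            }
        } else {
            MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }

        /* Collective: sum the ranks of all processes onto rank 0. */
        int sum = 0;
        MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("sum of ranks 0..%d = %d\n", size - 1, sum);

        MPI_Finalize();                          /* shut down the MPI runtime     */
        return 0;
    }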

Tutorial on Maxwell

Maxwell: A Brief Introduction
Tech specs:
o No. of nodes: 106
o Each node has two quad-core Intel CPUs at 2.3 GHz
o Total no. of cores: 848
o RAM per node: 8 GB
o Storage: 90 TB

How to create an account?
Send an email to Nancy Beasley (nancyj0@siu.edu) or Dr. Shaikh Ahmed (ahmed@siu.edu) and provide the following information:
o Name
o Affiliation
o IP address of the computer(s) on the SIUC network from which you will access Maxwell
You will receive an email with login information.

How to create an account? IP address look-up (Windows screenshots):
o Go to Start and type cmd to open the command prompt.
o Type ipconfig to get the IP address.

Login Procedure
o Download PuTTY (http://www.putty.org/ or http://download.cnet.com/putty/3000-7240_4-10808581.html)
o Run PuTTY
o Use the host name or IP address of Maxwell; host name: maxwell.ecehpc.siuc.edu
o Enable X11 forwarding

Login Procedure: Run PuTTY (screenshots):
o Start a session and type in the host name: maxwell.ecehpc.siuc.edu
o Enable X11: under Connection > SSH > X11, check "Enable X11 forwarding"
o Press Open to start the session

Login Procedure (continued):
o Type in your login ID (example: nanosiuc)
o Type in your password
o You are given a prompt with your login ID: the interface for using Maxwell

MPI Test
Copy the mpi_test directory to your home directory by typing the following command:

    cp -r /home/nanosiuc/mpi_test .

Run MPI
Type in ls to view the contents of your account.

Run MPI
The listing shows the folder containing the files to test MPI (mpi_test).

Run MPI
Change directory into mpi_test and view the files; open them to see their contents.

Run MPI
File: bsub.sh. This job script sets the job_name, output_file, run_limit, memory_limit, processor_use, and run_time used to run a parallel program with MPI. Change the login ID in the script (nanosiuc) to your own login ID.

Run MPI
File: submit_script.sh. Creates a new directory in which the output is generated; directory name: cpi_test_<number of nodes used>.
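The cpi_test_<...> directory name suggests the bundled example is the classic MPI "cpi" program, which approximates pi by numerical integration divided across ranks. A minimal sketch of such a program in C is shown below for illustration; it is an assumption, not necessarily the actual source shipped in mpi_test.

    /* cpi-style sketch: approximate pi with the midpoint rule, intervals split among ranks. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, i, n = 1000000;          /* n = number of intervals */
        double h, x, local = 0.0, pi = 0.0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* share n with every rank */
        h = 1.0 / (double)n;
        for (i = rank + 1; i <= n; i += size) {         /* each rank takes every size-th interval */
            x = h * ((double)i - 0.5);
            local += 4.0 / (1.0 + x * x);
        }
        local *= h;

        /* Combine the partial sums on rank 0 and report the approximation. */
        MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("pi is approximately %.16f (reference 3.141592653589793)\n", pi);

        MPI_Finalize();
        return 0;
    }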

Run MPI
Script to run MPI:

    ./submit_script.sh bsub.sh <# of processors> <input file>

<# of processors> is an integer. <input file> is optional; if it is in a different directory, give the path to the input file as well.

Run MPI
Script to run MPI (screenshot): the MPI program running with the chosen <# of processors>, and the output directory being created.

Run MPI
Viewing output (screenshot): contents of the output directory, including the output file.

Run MPI
Viewing output (screenshot): contents of the output file.

The End