The Red Storm System: Architecture, System Update and Performance Analysis
1 The Red Storm System: Architecture, System Update and Performance Analysis
Douglas Doerfler, Jim Tomkins
Sandia National Laboratories, Center for Computation, Computers, Information and Mathematics
LACSI 2006 Workshop on Performance & Productivity of Extreme-Scale Parallel Systems
October 17th, 2006
2 Outline
- Architecture & Design
- Upgrade: the numbers
- Availability & Reliability
- Application Performance
3 Red Storm Architecture
- Balanced System Performance: CPU, memory, interconnect and I/O
- Scalability: system hardware and system software scale from a single-cabinet system to a 32K-processor system
- Functional Partitioning: hardware and system software
- Reliability: full-system Reliability, Availability, Serviceability (RAS) designed into the architecture
- Upgradeability: designed-in path for system upgrade
- Red/Black Switching: flexible support for both classified and unclassified computing in a single system
- Custom Packaging: high-density, relatively low-power system
- Price/Performance: excellent performance per dollar; use high-volume commodity parts where feasible
4 Red Storm System (pre-upgrade)
- True MPP, designed to be a single system
- Distributed memory MIMD parallel supercomputer
- Fully connected 3-D mesh interconnect
- 108 compute node cabinets and 10,368 compute node processors (AMD 2.0 GHz)
- ~30 TB of DDR compute node memory (4 GB, 3 GB, 2 GB)
- 8 service and I/O cabinets on each end (256 processors for each color)
- ~400 TB of disk storage (~200 TB per color)
- Less than 2 MW total power and cooling
- Less than 3,000 ft² of floor space
5 Red Storm System Upgrade
                                     Pre-Upgrade        Post-Upgrade
TeraFLOPs                            ~41 TF             ~125 TF
3D Mesh (compute partition)          27x16x24           27x20x24
Nodes/Partition (Red/Center/Black)   256/10,368/256     320/12,960/320
Compute Memory                       ~31 TB (DDR333)    ~78 TB (DDR400)
Processors                           AMD 2.0 GHz        AMD Opteron
NIC                                  SeaStar V1.2       SeaStar V2.1 (~doubles HT bandwidth)
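The node counts and peak figures in the upgrade table can be cross-checked with a little arithmetic. The sketch below is not from the slides: it assumes each Opteron core retires 2 double-precision floating-point operations per clock cycle, and it takes the post-upgrade clock and core count (2.4 GHz, dual-core) from the "Before & After Upgrade by the Numbers" slide. The function and constant names are illustrative only.

```python
from math import prod

# Mesh dimensions copied from the upgrade table.
MESH_PRE = (27, 16, 24)
MESH_POST = (27, 20, 24)

# Assumption (not stated on the slides): 2 double-precision
# floating-point operations per core per clock cycle.
FLOPS_PER_CYCLE = 2


def compute_nodes(mesh):
    """Number of compute nodes in a fully populated 3-D mesh."""
    return prod(mesh)


def peak_teraflops(nodes, cores_per_node, clock_ghz,
                   flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak: nodes * cores * GHz * flops/cycle, in TFLOPS."""
    return nodes * cores_per_node * clock_ghz * flops_per_cycle / 1000.0


pre_nodes = compute_nodes(MESH_PRE)
post_nodes = compute_nodes(MESH_POST)

# Pre-upgrade: single-core 2.0 GHz; post-upgrade: dual-core 2.4 GHz.
pre_peak = peak_teraflops(pre_nodes, cores_per_node=1, clock_ghz=2.0)
post_peak = peak_teraflops(post_nodes, cores_per_node=2, clock_ghz=2.4)

print(pre_nodes, round(pre_peak, 1))
print(post_nodes, round(post_peak, 1))
```

Under these assumptions the mesh products reproduce the 10,368 and 12,960 compute nodes, and the peak estimates land close to the ~41 TF and ~125 TF quoted in the table.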
6 Red Storm Layout (post upgrade)
[Figure: layout of the compute node mesh. Labels: Normally Classified, Switchable Nodes, Normally Unclassified, I/O and Service Nodes (at both ends), Disconnect Cabinets. Disk storage system not shown.]
7 Red Storm System Software
- Operating Systems
  - LINUX on service and I/O nodes
  - LWK (Catamount) on compute nodes
  - LINUX on RAS nodes
- File Systems
  - Parallel File System: Lustre
  - Unix File System: Lustre
  - NFS v3
- Run-Time System
  - Logarithmic loader
  - Node allocator
  - Batch system: PBS Pro
  - Libraries: MPI, I/O, Math
  - Single System View
- Programming Model
  - Message Passing: MPI
  - Support for Heterogeneous Applications
- Tools
  - ANSI Standard Compilers (Fortran, C, C++): PGI
  - Debugger: TotalView
  - Performance Monitor: Cray Apprentice and PAPI
- System Management and Administration
  - Accounting
  - RAS GUI interface for monitoring system
  - Single System View
8 Red Storm System Management and RAS
- RAS Workstations: Cray CMS
  - Separate and redundant RAS workstations for Red and Black ends of machine
  - System administration and monitoring interface
  - Error logging and monitoring for major system components, including processors, memory, NIC/Router, power supplies, fans, disk controllers, and disks
- RAS Network: dedicated Ethernet network for connecting RAS nodes to RAS workstations
- RAS Nodes
  - One for each compute board: L0
  - One for each cabinet: L1
9 Red Storm Performance: Interconnect and I/O
- Interconnect performance
  - MPI latency (requirement and measured)
    - Neighbor: requirement < 5 µs; measured 6.0 µs generic / 3.6 µs accelerated
    - Full machine: requirement < 8 µs; measured: add ~3 µs to the above
  - Measured MPI bandwidth: ~2,200 MB/s uni-directional, ~4,000 MB/s bi-directional
  - Peak HT bandwidth: 3.2 GB/s each direction
  - Peak link bandwidth: 3.84 GB/s each direction
  - Bi-section bandwidth: ~3.69 TB/s Y by Z; ~4.98 TB/s X by Z; ~8.30 TB/s X by Y (torus)
- I/O system performance
  - PFS requirement: 50 GB/s sustained for each color
    - Observed 50 GB/s using IOR under ideal conditions
  - External requirement: 25 GB/s (aggregate) sustained for each color
    - Observed 600 MB/s over a single 10GE link
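The bi-section bandwidth figures on this slide appear to follow from the mesh dimensions and the peak link bandwidth. The sketch below rests on assumptions the slide does not spell out: that the post-upgrade 27x20x24 mesh is meant, that both link directions are counted, and that only the Z dimension wraps around as a torus (which doubles the number of links crossing an X-by-Y cut). The names are illustrative.

```python
# Peak link bandwidth per direction, GB/s (from the slide).
LINK_GB_PER_S = 3.84

# Assumption: the bi-section figures refer to the post-upgrade mesh.
MESH = {"X": 27, "Y": 20, "Z": 24}

# Assumption: only Z is a torus; X and Y are plain mesh dimensions.
TORUS_DIMS = {"Z"}


def bisection_tb_per_s(cut_dim):
    """Bandwidth across a plane that cuts the mesh perpendicular to cut_dim."""
    links = 1
    for dim, size in MESH.items():
        if dim != cut_dim:
            links *= size  # one link per node position in the cut plane
    if cut_dim in TORUS_DIMS:
        links *= 2  # the wrap-around links cross the cut as well
    # Count both directions of every link and convert GB/s to TB/s.
    return links * LINK_GB_PER_S * 2 / 1000.0


y_by_z = bisection_tb_per_s("X")  # plane spanned by Y and Z
x_by_z = bisection_tb_per_s("Y")  # plane spanned by X and Z
x_by_y = bisection_tb_per_s("Z")  # plane spanned by X and Y (torus)

print(round(y_by_z, 2), round(x_by_z, 2), round(x_by_y, 2))
```

With these assumptions the three cuts come out at roughly 3.69, 4.98 and 8.29 TB/s, in line with the approximate values quoted on the slide.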
10 System Availability OUO
11 System Reliability OUO
12 Application Performance Post-Upgrade Preliminary Analysis
13 Upgrade Performance
Before & After Upgrade by the Numbers
- 40 to 125 TeraFLOPS
- 10,368 to 12,960 compute nodes
- 512 to 640 service nodes
- 2X the SeaStar interconnect bandwidth
- 2.0 GHz single-core to 2.4 GHz dual-core AMD Opteron processors
- DDR-333 to DDR-400 memory speed
Status
- Black Section is in progress
- Center Section: Oct 06
- Red Section: Nov 06
14 Upgrade Performance: Single-Core vs Dual-Core
- CTH - Shape Charge
  - Constant work/core
  - Speedup of at least 1.4 out to 2048 sockets
- Sage - timing_c
  - Constant work/core
  - At scale, speedup is at least a factor of 1.6
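The slide does not define how the single-core vs dual-core speedup is computed. One possible reading, offered here only as an assumption, is a per-socket comparison: with constant work per core, a dual-core socket carries twice the work of a single-core socket, so the speedup is twice the ratio of the run times. The function names below are hypothetical and no measured timings are implied.

```python
def socket_speedup(t_single_core, t_dual_core):
    """Per-socket speedup under constant work per core (assumed definition).

    A dual-core socket carries twice the work of a single-core socket,
    so equal run times correspond to a speedup of 2.
    """
    if t_single_core <= 0 or t_dual_core <= 0:
        raise ValueError("run times must be positive")
    return 2.0 * t_single_core / t_dual_core


def max_slowdown_for(speedup):
    """Largest t_dual / t_single ratio that still yields the given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 2.0 / speedup


# Run-time growth that the quoted speedups would tolerate under this reading.
print(round(max_slowdown_for(1.4), 2))
print(round(max_slowdown_for(1.6), 2))
```

Under this reading, a speedup of 1.4 would mean the dual-core run takes at most about 1.43 times as long as the single-core run on the same number of sockets, and a speedup of 1.6 would mean at most 1.25 times as long.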
15 Application Performance Pre-Upgrade Preliminary Analysis
16 Sage Scaling (John Daly, LANL) Updated data point
17 CTH: ASC Purple & Red Storm Performance
[Figure: Sandia's CTH (Shape Charge, 90x216x90 cells/PE), execution time for 100 cycles. Y axis: wall time (secs); X axis: number of processors; series: CTH-Purple, CTH-Red Storm.]
18 SEAM Benchmarks (aqua planet)
- Red Storm: 5 TF max
- BG/L: 4 TF max
- SEAM = NCAR's Spectral Element Atmospheric Model; POP = LANL's Parallel Ocean Program
19 POP Benchmarks (1/10 degree ocean)
20 The Impact of a Balanced Architecture
- Architectural balance with low system noise is the key to a scalable platform
- Well-balanced traits translate to high real-world application performance
21 Conclusions
- Red Storm is an architecture
- Red Storm is an instantiation of that architecture
- Red Storm has demonstrated excellent scalability on real applications
- The upgrade has shown significant application speedup (more analysis to come)