Appro Supercomputer Solutions: Appro and Tsukuba University Accelerator Cluster Collaboration


1 Appro Supercomputer Solutions: Appro and Tsukuba University Accelerator Cluster Collaboration. Steven Lyness, VP HPC Solutions Engineering.

2 About Appro. Over 20 years of experience: from OEM server manufacturer, to branded servers and clusters, to solutions manufacturer (2007 to 2012), to end-to-end supercomputer solutions moving forward.

3 Appro on the Top500. Over 2 PFLOPS (peak) from just five Top100 systems added to the Top500 in November 2011. Variety of technologies: Intel, AMD, NVIDIA; multiple server form factors; InfiniBand and GigE; fat tree and 3D torus. Excellent Linpack efficiency on non-optimized Sandy Bridge systems: 85.5% fat tree, 83% to 85% 3D torus.

4 Appro Milestones: installations in 2012 (site, peak performance): Los Alamos (LANL) > 1.8 PFLOPS; Sandia (SNL) > 1.2 PFLOPS; Livermore (LLNL) > 1.5 PFLOPS; Japan (Tsukuba, Kyoto) > 1 PFLOPS.

5 About University of Tsukuba: HA-PACS Project. HA-PACS (Highly Accelerated Parallel Advanced system for Computational Sciences), Apr. 2011 to Mar. 2014, a 3-year project. Project Leader: Prof. M. Sato (Director, CCS, Univ. of Tsukuba). Develop a next-generation GPU system (15 members): Project Office for Exascale Computing System Development (Leader: Prof. T. Boku), a GPU cluster based on the Tightly Coupled Accelerators architecture. Develop large-scale GPU applications (15 members): Project Office for Exascale Computational Sciences (Leader: Prof. M. Umemura), covering elementary particle physics, astrophysics, bioscience, nuclear/quantum physics, global environmental science, and high performance computing.

6 University of Tsukuba HA-PACS Project :: Problem Definition. Many technology discussions to determine the key requirements: fixed budget; high availability; latest processors / high FLOPS; 1:2 CPU:accelerator ratio; high bandwidth to the accelerators; high-bandwidth, low-latency interconnect (applications could take advantage of more than QDR IB); high I/O bandwidth to storage; easy to manage.

7 Solution Keys :: Fixed Budget Considerations. Need to find a balance between: performance (FLOPS; memory and I/O bandwidth); capacity (CPU quantity, GPU quantity, memory per core, I/O, storage); availability features; and ease of management / supportability. Architecture needed: high-availability nodes (power supplies, fans), IPC networks (e.g. InfiniBand), and service networks (provisioning and management).

8 Meeting Key Requirements. Challenge: create a solution with high availability: redundant power supplies, redundant hot-swap fan trays, redundant hot-swap disk drives, redundant networks. Solution: the Appro Xtreme-X Supercomputer, the flagship product line using the GreenBlade sub-rack component used for the DoE TLCC2 project, expanded to add support for new custom blade nodes.

9 Solution Architecture :: Appro Xtreme-X Supercomputer. A unified, scalable cluster architecture that can be provisioned and managed as a stand-alone supercomputer. Improved power and cooling efficiency dramatically lowers total cost of ownership. Offers high performance and high availability features with lower latency and higher bandwidth. Appro HPC Software Stack: complete HPC cluster software tools combined with the Appro Cluster Engine (ACE) management software, including the following capabilities: system management, network management, server management, cluster management, and storage management.

10 Meeting Key Requirements :: Optimal Peak Performance. CPU contribution: Sandy Bridge-EP 2.6 GHz Xeon E5 processors (332 GFLOPS per node). GPU contribution: 665 GFLOPS per NVIDIA M2090; four (4) M2090s per node, or 2.66 TFLOPS per node. Combined peak performance is about 3 TFLOPS per node, and two hundred sixty-eight (268) nodes provide 802 TFLOPS. Accelerator performance: a dedicated PCIe Gen3 x16 slot for each NVIDIA GPU; the M2090 runs at Gen2, so up to 8 GB/s per GPU is available. I/O performance: 2x QDR InfiniBand (Mellanox ConnectX-3), up to 4 GB/s per link on a PCIe Gen3 x8 bus, plus GigE for the operations networks.
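The peak figures above are simple products of per-device peaks; as a sanity check, here is a minimal sketch of the arithmetic in Python, using only the numbers quoted on this slide:

```python
# Peak-performance arithmetic for one HA-PACS node and the full system,
# using the figures quoted on this slide (all values in GFLOPS).
CPU_PEAK_PER_NODE = 332.8   # 2x Sandy Bridge-EP, 2.6 GHz, 8 cores/socket, AVX
GPU_PEAK = 665.0            # double precision, one NVIDIA M2090
GPUS_PER_NODE = 4
NODES = 268

node_peak = CPU_PEAK_PER_NODE + GPUS_PER_NODE * GPU_PEAK   # ~2993 GFLOPS, i.e. ~3 TFLOPS
system_peak = node_peak * NODES                            # ~802 TFLOPS

print(f"per-node peak : {node_peak / 1000:.2f} TFLOPS")
print(f"system peak   : {system_peak / 1000:.1f} TFLOPS")
```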

11 Appro GreenBlade Sub-Rack with Accelerator Expansion Blades. Up to 4x 2P GB812X blades, with expandability for HDD, SSD, GPU, and MIC. Six cooling fan units, hot-swappable and redundant. Up to six 1600 W power supplies, Platinum-rated (95%+ efficient), hot-swappable and redundant. Supports one or redundant iSCB platform manager modules with enhanced management capabilities: active and dynamic fan control, power monitoring, remote power control, and an integrated console server.

12 Appro GreenBlade Sub-Rack :: iSCB Modules and Server Board. Increased memory footprint (2 DIMMs per channel). Provides access to two (2) PCIe Gen3 x16 links per socket, for increased I/O capability. QDR or FDR InfiniBand on the motherboard. Internal RAID adapter on the Gen3 bus; up to two (2) 2.5-inch hard drives. NOTE: nodes can run diskless/stateless because of the Appro Cluster Engine, but local scratch was needed.

13 Meeting Key Requirements :: Server Node Design. Challenge: create a server node with the latest generation of processors (need for FLOPS and I/O capacity), high bandwidth to the accelerators, and high memory capacity. Solution: high bandwidth with Intel Sandy Bridge-EP for the CPU and NVIDIA Tesla for the GPU. Worked with Intel EPSD early on to design a motherboard: the Washington Pass (S2600WP) motherboard with dual Sandy Bridge-EP (E5-2600) sockets; it exposes four (4) PCIe Gen3 x16 links for accelerator connectivity and one (1) PCIe Gen3 x8 expansion slot for I/O, supports two (2) DIMMs per channel (16 DIMMs total), and uses a 2U form factor for fit and airflow/cooling.

14 Meeting Key Requirements :: Intel EPSD S2600WP Motherboard (block diagram). Two Sandy Bridge-EP sockets linked by QPI, each with 4 DDR3 channels at 1,600 MHz (51.2 GB/s per socket). The Patsburg PCH (on DMI/ESI) provides dual GbE and hosts the BMC and BIOS. PCIe connectivity: four Gen3 x16 links to the 4x NVIDIA M2090 GPUs, one Gen3 x8 link to the 2x QDR IB adapter, one Gen3 x8 link for expansion, plus PCIe x4 and dual GbE.
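The 51.2 GB/s per socket shown in the diagram follows directly from the channel configuration; a small sketch of that calculation (assuming DDR3-1600 with a 64-bit channel, i.e. 8 bytes per transfer):

```python
# Peak memory bandwidth per socket for the S2600WP configuration shown above:
# 4 DDR3 channels at 1600 MT/s, 8 bytes transferred per channel per transfer.
CHANNELS = 4
TRANSFER_RATE_MT_S = 1600      # mega-transfers per second (DDR3-1600)
BYTES_PER_TRANSFER = 8         # 64-bit channel width

per_socket = CHANNELS * TRANSFER_RATE_MT_S * BYTES_PER_TRANSFER / 1000  # GB/s
per_node = 2 * per_socket                                               # two sockets

print(f"per socket: {per_socket:.1f} GB/s")   # 51.2 GB/s
print(f"per node  : {per_node:.1f} GB/s")     # 102.4 GB/s
```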

15 GreenBlade Node Design (diagram). Node connections: HDD0 and HDD1; GigE cluster management / operations network (prime); QDR InfiniBand (port 0); QDR InfiniBand (port 1); GigE cluster management / operations network (secondary).

16 Meeting Key Requirements :: Network Availability. Challenge: to provide cost-effective redundant networks that eliminate or reduce interruptions (improve MTTI). Solution: build the system with redundant operations Ethernet networks (redundant on-board GigE, each with access to IPMI; redundant iSCB modules for baseboard management, node control, and monitoring) and with redundant InfiniBand networks (dual QDR for price/performance; Mellanox was selected due to its Gen3 x8 support in a dual-port adapter).

17 Meeting Key Requirements :: Operations Networking (diagram). Management node(s) and login node(s) connect to the external network over GbE/10GigE through a 10GigE switch; sub-management nodes (GreenBlade GB812X) and 48-port leaf switches serve the compute nodes, three racks per leaf switch group, from Rack (1) through Rack (N).

18 Meeting Key Requirements :: Ease of Use. Challenge: the system needs to install quickly to get into production; most sites have limited staff; the system must be kept running and doing science. Solution: the Appro HPC Software Stack, tested and validated, a full stack from the hardware layer to the application layer that allows quick bring-up of a cluster.

19 Appro HPC Software Stack. User applications run on top of: performance monitoring (HPCC, Perfctr, IOR, PAPI/IPM, netperf); compilers (Intel Cluster Studio, PGI (PGI CDK), GNU, PathScale); message passing (MVAPICH2, OpenMPI, Intel MPI from Intel Cluster Studio); job scheduling (Grid Engine, SLURM, PBS Pro); storage (NFS 3.x, local file systems (ext3, ext4, XFS), PanFS, Lustre); cluster monitoring (ACE); remote power management (ACE via iSCB and OpenIPMI, PowerMan); console management (ACE, ConMan); provisioning (Appro Cluster Engine (ACE) virtual clusters); and the OS (Linux: Red Hat, CentOS, SuSE). All of this sits on the Appro Xtreme-X Supercomputer building blocks, with Appro turn-key integration and delivery services (HW and SW integration, pre-acceptance testing, dismantle, packing and shipping) and Appro HPC professional services (on-site installation services and/or customized services).

20 Appro Key Advantages :: Summary. Partnering with key technology partners to offer cutting-edge integrated solutions. Performance: storage (IOR) and networking (bandwidth, latencies, and message rates). Features: high availability (high standard MTBF, redundant power supplies), ease of management, flexibility, and price/performance. Training programs: pre-sales (sell everything it does and ONLY that), installation and tuning, and post-install support.

21 Appro Xtreme-X Supercomputer :: Turn-Key Solution Summary. Appro HPC Software Stack and Appro Cluster Engine (ACE) management software suite. The Appro Xtreme-X Supercomputer addresses 4 HPC workload configurations: capacity computing, hybrid computing, data-intensive computing, and capability computing. Turn-key integration and delivery services: node, rack, switch, interconnect, cable, network, storage, software, burn-in; pre-acceptance testing, performance validation, dismantle, packing and shipping. Appro HPC professional services: on-site installation services and/or customized services.

22 Questions? Ask Now or see us at Table #54 Appro Supercomputer Solutions Steve Lyness, VP HPC Solutions Engineering Learn More at

23 HA-PACS: Next Step for the Scientific Frontier by Accelerated Computing. Taisuke Boku, Center for Computational Sciences, University of Tsukuba. GTC2012, San Jose.

24 Project Plan of HA-PACS. HA-PACS (Highly Accelerated Parallel Advanced system for Computational Sciences): accelerating critical problems in various scientific fields at the Center for Computational Sciences, University of Tsukuba. The target application fields will be partially limited; current targets are QCD, astrophysics, and QM/MM (quantum mechanics / molecular mechanics, for life science). Two parts: the HA-PACS base cluster, for development of GPU-accelerated code for the target fields and for performing production runs of them; and HA-PACS/TCA, for elementary research on direct communication between accelerators (described later).

25 GPU Computing: the current trend in HPC. GPU clusters in the TOP500 (Nov. 2011): 2nd 天河 Tianhe-1A (Rpeak = 4.70 PFLOPS), 4th 星雲 Nebulae (Rpeak = 2.98 PFLOPS), 5th TSUBAME2.0 (Rpeak = 2.29 PFLOPS); for reference, 1st is the K Computer (Rpeak = 11.28 PFLOPS). Features: high peak performance / cost ratio and high peak performance / power ratio, but large-scale applications with GPU acceleration do not yet run in production on GPU clusters. Our first target is to develop large-scale applications accelerated by GPUs in real computational sciences.

26 Problems of GPU Clusters. Problems of GPGPU for HPC: (1) Data I/O performance limitation, e.g. GPGPU on PCIe gen2 x16: 8 GB/s of I/O versus 665 GFLOPS of peak performance (NVIDIA M2090). (2) Memory size limitation, e.g. M2090: 6 GByte vs. CPU memory of 128 GByte per HA-PACS node. (3) Communication between accelerators: there is no direct (external) path, so communication latency via the CPU (GPU memory, to CPU memory, over MPI to the remote CPU memory, and then to the remote GPU memory) becomes large. Our other target is developing a direct communication system between external GPUs as a feasibility study for future accelerated computing, together with research on direct communication.

27 Project Formation. HA-PACS (Highly Accelerated Parallel Advanced system for Computational Sciences), Apr. 2011 to Mar. 2014, a 3-year project. Project Leader: Prof. M. Sato (Director, CCS, Univ. of Tsukuba). Develop a next-generation GPU system (15 members): Project Office for Exascale Computing System Development (Leader: Prof. T. Boku), a GPU cluster based on the Tightly Coupled Accelerators architecture. Develop large-scale GPU applications (15 members): Project Office for Exascale Computational Sciences (Leader: Prof. M. Umemura).

28 HA-PACS base cluster (Feb. 2012) (photo).

29 HA-PACS base cluster (photos: front view and side view).

30 HA-PACS base cluster (photos): front view of 3 blade chassis; rear view of one blade chassis with 4 blades; rear view of InfiniBand cables (yellow = fibre, black = ...).

31 HA-PACS: base cluster (computation node). CPU: AVX at 2.6 GHz x 8 flop/clock, 20.8 GFLOPS x 16 cores = 332.8 GFLOPS; memory (16 GB, 12.8 GB/s) x 8 = 128 GB. GPU: 665 GFLOPS x 4 = 2,660 GFLOPS; GPU memory (6 GB, 177 GB/s) x 4 = 24 GB, 708 GB/s; 8 GB/s of PCIe per GPU. Node total: approximately 3 TFLOPS.

32 HA-PACS: base cluster unit (CPU). Intel Xeon E5 (Sandy Bridge-EP) x 2; 8 cores/socket (16 cores/node) at 2.6 GHz; AVX (256-bit SIMD) on each core. Peak perf./socket = 2.6 GHz x (4 x 2) flops/clock x 8 cores = 166.4 GFLOPS; peak perf./node = 332.8 GFLOPS. Each socket supports up to 40 lanes of PCIe gen3, giving great capacity to connect multiple GPUs without an I/O performance bottleneck: the current NVIDIA M2090 supports just PCIe gen2, but the next generation (Kepler) will support PCIe gen3. M2090 x 4 can be connected to the 2 Sandy Bridge-EP sockets, with PCIe gen3 x8 x 2 still remaining for InfiniBand QDR x 2.

33 HA-PACS: base cluster unit (GPU). NVIDIA M2090 x 4. Number of processor cores: 512; processor core clock: 1.3 GHz; DP 665 GFLOPS, SP 1331 GFLOPS. PCI Express gen2 x16 system interface. Board power dissipation: <= 225 W. Memory clock: 1.85 GHz; size: 6 GB with ECC; bandwidth: 177 GB/s. Shared memory/L1 cache: 64 KB; L2 cache: 768 KB.
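The SP/DP peak figures follow from the core count and clock; here is a short sketch of the derivation, assuming the Fermi rate of one fused multiply-add per core per cycle in single precision and half that throughput in double precision:

```python
# Peak FLOPS derivation for the Tesla M2090 from the numbers on this slide.
CORES = 512
CLOCK_GHZ = 1.3
SP_FLOPS_PER_CORE_PER_CYCLE = 2   # one fused multiply-add (2 flops) per cycle on Fermi

sp_peak = CORES * CLOCK_GHZ * SP_FLOPS_PER_CORE_PER_CYCLE   # ~1331 GFLOPS single precision
dp_peak = sp_peak / 2                                       # Fermi DP rate is half of SP: ~665 GFLOPS

print(f"SP peak: {sp_peak:.0f} GFLOPS, DP peak: {dp_peak:.0f} GFLOPS")
```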

34 HA-PACS: base cluster unit (blade node). Per node: 4x NVIDIA Tesla M2090 (two at the front, two at the rear), 1x PCIe slot for the HCA, 2x 2.6 GHz 8-core Sandy Bridge-EP, 2x 2.5-inch HDD, with front-to-rear air flow (photos: front view, rear view, power supply units and fans). Enclosure: 8U, 4 nodes, 3 PSUs (hot-swappable), 6 fans (hot-swappable).

35 Basic performance data. MPI ping-pong: 6.4 GB/s (N_1/2 = 8 KB) with dual-rail InfiniBand QDR (Mellanox ConnectX-3); the HCAs are actually FDR parts, with QDR at the switch. PCIe benchmark (device-to-host memory copy), aggregate for 4 GPUs simultaneously: 24 GB/s (N_1/2 = 20 KB), against a theoretical peak of 8 GB/s x 4 = 32 GB/s for PCIe gen2 x16 x 4. STREAM (memory): 74.6 GB/s, against a theoretical peak of 102.4 GB/s (2 sockets x 51.2 GB/s).
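N_1/2 in these figures is the message size at which half of the asymptotic bandwidth is reached. A minimal latency/bandwidth model illustrating the idea is sketched below; the startup latency is a hypothetical value chosen so that N_1/2 comes out near the 8 KB quoted for the dual-rail IB result, not a measured number.

```python
# Simple latency/bandwidth model illustrating what the N_1/2 figures above mean:
#   T(n) = t0 + n / BW_asym, effective bandwidth = n / T(n),
# and half of the asymptotic bandwidth is reached at N_1/2 = t0 * BW_asym.
BW_ASYM = 6.4e9      # bytes/s, asymptotic ping-pong bandwidth from this slide
T0 = 1.28e-6         # seconds, assumed startup latency (hypothetical, gives N_1/2 ~ 8 KB)

def effective_bw(n_bytes: float) -> float:
    """Effective bandwidth in bytes/s for an n-byte message."""
    return n_bytes / (T0 + n_bytes / BW_ASYM)

n_half = T0 * BW_ASYM    # message size that reaches half of BW_ASYM
print(f"N_1/2 = {n_half / 1024:.1f} KB")
for n in (1024, 8192, 1 << 20):
    print(f"{n:>8} B -> {effective_bw(n) / 1e9:.2f} GB/s")
```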

36 PCIe host-device communication performance (plot). Host-to-device copies show a slower start-up than device-to-host copies.

37 HA-PACS Application (1): Elementary Particle Physics. Multi-scale physics: investigate hierarchical properties (quark, proton/neutron, nucleus) via direct construction of nuclei in lattice QCD; the GPU is used to solve large sparse linear systems of equations. Finite temperature and density: phase analysis of QCD at finite temperature and density (expected QCD phase diagram); the GPU is used to perform matrix-matrix products of dense matrices.

38 HA-PACS Applications (2): Astrophysics. (A) Collisional N-body simulation. Globular clusters: formation of the most primordial objects, formed more than 10 giga-years ago; these fossil objects are a clue to investigating the primordial universe. Massive black holes in galaxies: understanding the formation of massive black holes in galaxies through numerical simulations of the complicated gravitational interactions between stars and multiple black holes in galaxy centers. Direct (brute-force) calculations of accelerations and jerks are required to achieve the necessary numerical accuracy; computing the accelerations of particles and their time derivatives (jerks) is time consuming, so accelerations and jerks are computed on the GPU. (B) Radiation transfer. First stars and re-ionization of the universe: understanding the formation of the first stars and the subsequent re-ionization of the universe. Accretion disks around black holes: study of the high-temperature regions around black holes. Radiation transfer is so far poorly investigated due to its huge computational cost, though it is of critical importance in the formation of stars and galaxies; it involves calculating the physical effects of photons emitted by stars and galaxies onto the surrounding matter, and computations of the radiation intensity and the resulting chemical reactions based on ray-tracing methods can be highly accelerated with GPUs owing to their high concurrency.

39 HA-PACS Application (3): Bioscience. GPU acceleration of: DNA-protein complexes (macroscale MD), with direct Coulomb interactions offloaded to the GPU (GROMACS, NAMD, Amber); and reaction mechanisms (QM/MM-MD), with QM regions of > 100 atoms and two-electron integrals offloaded to the GPU.

40 HA-PACS Application (4): other advanced research in the HPC Division at CCS. XcalableMP-dev (XMP-dev): an easy and simple programming language supporting distributed memory and GPU-accelerated computing for large-scale computational sciences. The G8 NuFuSE (Nuclear Fusion Simulation for Exascale) project: a platform for porting plasma simulation code with GPU technology. Climate simulation, especially LES (Large Eddy Simulation) for cloud-level resolution at city-model size. Any other collaboration...

41 HA-PACS: TCA (Tightly Coupled Accelerator). TCA provides direct connection between accelerators (GPUs), using PCIe as the communication device between accelerators. Most acceleration devices and other I/O devices are connected by PCIe as PCIe end-points (slave devices); an intelligent PCIe device logically enables an end-point device to communicate directly with other end-point devices. PEARL: PCI Express Adaptive and Reliable Link. We already developed such a PCIe device (PEACH, PCI Express Adaptive Communication Hub) in the JST-CREST project on a low-power and dependable network for embedded systems; it enables direct connection between nodes by a PCIe Gen2 x4 link. We are improving PEACH for HPC to realize TCA.

42 PEACH: PCI-Express Adaptive Communication Hub. An intelligent PCI-Express communication switch that uses the PCIe link directly for node-to-node interconnection; the edge of a PEACH PCIe link can be connected to any peripheral device, including a GPU. Prototype PEACH chip: 4-port PCI-E gen2 with x4 lanes per port; a PCI-E link edge control feature automatically switches (flips) between root complex and end-point roles.

43 HA-PACS/TCA (Tightly Coupled Accelerator): true GPU-direct. Current GPU clusters require 3-hop communication (3 to 5 memory copies): GPU memory, to CPU memory, over the IB HCA and switch to the remote CPU memory, and then to the remote GPU memory. For strong scaling, an inter-GPU direct communication protocol is needed for lower latency and higher throughput. PEACH2 is an enhanced version of PEACH: x4 lanes widened to x8 lanes, hardwired on the main data path and PCIe interface fabric. (Diagram: two nodes, each with CPUs, memory, GPUs and an IB HCA on PCIe, plus a PEACH2 chip providing the direct GPU-to-GPU path between nodes.)
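To see why the 3-hop path hurts, a rough back-of-the-envelope model comparing a host-staged transfer with a direct PEACH2-style link is sketched below. The bandwidths are the ones quoted in these slides (8 GB/s PCIe gen2 x16, 4 GB/s per IB QDR link, 5 GB/s PEACH2 gen2 x8); the per-hop latency and the serial, unpipelined hop model are assumptions used only for illustration.

```python
# Back-of-the-envelope comparison of GPU-to-GPU transfer time:
#   staged: GPU -> host memory -> IB -> remote host memory -> GPU (3 hops, extra copies)
#   direct: GPU -> PEACH2 -> remote GPU (single PCIe path)
# Bandwidths come from these slides; the per-hop latency is an assumed value.
PCIE_GEN2_X16 = 8e9      # bytes/s, GPU <-> host memory
IB_QDR_LINK   = 4e9      # bytes/s, one QDR rail
PEACH2_G2_X8  = 5e9      # bytes/s, PCIe gen2 x8 direct link
HOP_LATENCY   = 2e-6     # seconds per hop (assumption for illustration)

def staged(n_bytes: int) -> float:
    """Transfer time when staging through both hosts, hops taken serially."""
    hops = (PCIE_GEN2_X16, IB_QDR_LINK, PCIE_GEN2_X16)
    return sum(HOP_LATENCY + n_bytes / bw for bw in hops)

def direct(n_bytes: int) -> float:
    """Transfer time over a single direct PEACH2-style link."""
    return HOP_LATENCY + n_bytes / PEACH2_G2_X8

for n in (4 * 1024, 64 * 1024, 4 * 1024 * 1024):
    print(f"{n // 1024:>5} KB: staged {staged(n) * 1e6:7.1f} us, direct {direct(n) * 1e6:7.1f} us")
```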

44 Implementation of PEACH2: ASIC vs. FPGA. FPGA-based implementation: today's advanced FPGAs allow a PCIe hub with multiple ports (currently gen2 x8 lanes x 4 ports; gen3 will be available soon (?)); easy modification and enhancement; fits a standard (full-size) PCIe board; an internal multi-core general-purpose CPU with programmability is available, making it easy to split the hardwired/firmware partitioning at a suitable level of the control layer. Controlling PEACH2 for the GPU communication protocol: collaboration with NVIDIA for information.

45 HA-PACS/TCA. A Node Cluster (NC) is 16 nodes with 64 GPUs (G x 4 per node) and 32 CPUs (C x 2 per node), connected by the PEARL ring network (PEACH2, PCI-E gen2 x8 = 5 GB/s per link) for high-speed GPU-GPU communication within the NC, and by InfiniBand QDR (x2, 4 GB/s per link) for NC-to-NC communication: GPU communication goes over PCIe inside the NC and over IB links between NCs. 4 NCs with 16 nodes each, or 8 NCs with 8 nodes each, add 360 TFLOPS to the base cluster. Node CPU: Xeon E5.

46 PEARL/PEACH2 variation (1), Option 1. The performance of IB and PEARL can be compared evenly; the PCIe switch adds extra latency. (Diagram: two sockets linked by QPI; GPUs on Gen3 x16 links; the IB HCA on Gen3 x8; the PEACH2 chip (Gen2 x8) behind a PCIe switch on Gen3 x8.)

47 PEARL/PEACH2 variation (2), Option 2. Requires only 72 lanes in total, with an asymmetric connection among the 3 blocks of GPUs. (Diagram: two sockets linked by QPI; GPUs on Gen3 x16 links; the IB HCA on Gen3 x8; the PEACH2 chip (Gen2 x8) behind a PCIe Gen3 x16 switch shared with one GPU.)

48 PEACH2 prototype board for TCA (photo). FPGA daughter board (Altera Stratix IV GX530); PCIe external link connectors (one more on the daughter board); PCIe edge connector (to the host server); power regulator for the FPGA.

49 Summary. HA-PACS consists of two elements: the HA-PACS base cluster, for application development, and HA-PACS/TCA, for elementary study of advanced technology for direct communication among accelerating devices (GPUs). The HA-PACS base cluster started operation in Feb. 2012 with 802 TFLOPS peak performance (the Linpack result will come in June 2012, and we also expect a good score on the Green500). The FPGA implementation of PEACH2 is finished for the prototype version.
