
1 HPC Systems Overview. SDSC Summer Institute, August 6-10, 2012, San Diego, CA. Shawn Strande, Gordon Project Manager, San Diego Supercomputer Center.

2 Trestles: High Productivity System
- Targeted at modest-scale jobs and science gateways
- Appro-integrated 324-node AMD Magny-Cours cluster
- 120 GB of Intel flash (SSD) in each node, used for the OS and available as user scratch space
- QDR InfiniBand fat-tree interconnect
- In production operation since December 2011
- Funded by the NSF and available through the NSF Extreme Science and Engineering Discovery Environment (XSEDE) program

3 Trestles Design Objectives
- Target modest-scale users: limit job size to 1,024 cores (32 nodes)
- Serve a large number of users: cap allocations at 1.5M core-hours per project per year (~2.5% of the annual total); gateways are exempt from this cap because they represent a large number of users
- Maintain fast turnaround time: the primary system metric is expansion factor rather than traditional system utilization (see the sketch below); allocate ~70% of the theoretically available core-hours (revised as needed); reserve nodes for interactive and short jobs
- Respond to users' requirements: robust software suite; unique capabilities such as pre-emptive on-demand access and user-settable reservations; long-running job queues (48 hours standard, up to 2 weeks allowed)
- Bring in new users and communities: welcome small jobs and allocations, start-up requests up to 50,000 SUs, gateway-friendly
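
The expansion factor referenced above measures turnaround rather than utilization. A minimal sketch of the per-job calculation, assuming the common definition (queue wait plus run time, divided by run time); the exact aggregation SDSC uses is not given on the slide:

```python
# Expansion factor for a single job: 1.0 means the job started immediately,
# larger values mean proportionally slower turnaround.
def expansion_factor(wait_hours: float, run_hours: float) -> float:
    return (wait_hours + run_hours) / run_hours

# Example: a 4-hour job that waited 1 hour in the queue.
print(expansion_factor(wait_hours=1.0, run_hours=4.0))  # 1.25
```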

4 Trestles Architecture (diagram). Data movers (4x) connect to the XSEDE & R&E networks and the SDSC network. A redundant front end provides round-robin login nodes (2x), mirrored NFS servers (4x), and management nodes (2x). The 324 compute nodes attach to a QDR InfiniBand switch; 12x IB/Ethernet bridge switches and 2x Arista GbE switches (MLAG) carry management and public traffic. Storage is the Data Oasis Lustre PFS (4 PB), shared with the Gordon cluster. Link types shown: GbE, 10GbE, and QDR 40 Gb/s.

5 Trestles compute node (photo): AMD Magny-Cours processors, 120 GB flash (SSD), QDR InfiniBand.

6 Trestles Speeds and Feeds
AMD Magny-Cours compute node:
- Sockets: 4
- Cores: 32
- Clock speed: 2.4 GHz
- Flop speed: 307 Gflop/s
- Memory capacity: 64 GB
- Memory bandwidth: 171 GB/s
- STREAM Triad bandwidth: 100 GB/s
- Flash memory (SSD): 120 GB
Full system:
- Total compute nodes: 324
- Total compute cores: 10,368
- Peak performance: 100 Tflop/s
- Total memory: 20.7 TB
- Total memory bandwidth: 55.4 TB/s
- Total flash memory: 39 TB
QDR InfiniBand interconnect:
- Topology: fat tree
- Link bandwidth: 8 GB/s (bidirectional)
- Peak bisection bandwidth: 5.2 TB/s (bidirectional)
- MPI latency: 1.3 µs
Disk I/O subsystems:
- File systems: NFS, Lustre
- Storage capacity (usable): 400 TB scratch, 400 TB project
- I/O bandwidth: 12 GB/s
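
Most of the system-level numbers follow from the per-node figures. A quick consistency check (a sketch; the 4 flops/cycle/core figure for Magny-Cours is an assumption, since the slide gives only the 307 Gflop/s result):

```python
# Back-of-the-envelope check of the Trestles speeds-and-feeds table.
cores_per_node = 32
clock_ghz = 2.4
flops_per_cycle = 4            # assumed for AMD Magny-Cours
nodes = 324
mem_per_node_gb = 64
flash_per_node_gb = 120

node_gflops = cores_per_node * clock_ghz * flops_per_cycle    # ~307 Gflop/s
system_tflops = node_gflops * nodes / 1000                    # ~100 Tflop/s
total_mem_tb = nodes * mem_per_node_gb / 1000                 # ~20.7 TB
total_flash_tb = nodes * flash_per_node_gb / 1000             # ~39 TB

print(f"{node_gflops:.0f} Gflop/s per node, {system_tflops:.0f} Tflop/s system, "
      f"{total_mem_tb:.1f} TB DRAM, {total_flash_tb:.0f} TB flash")
```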

7 Gordon: An Innovative Data-Intensive Supercomputer
- Designed to accelerate access to massive amounts of data in genomics, earth science, engineering, medicine, and other areas
- Emphasizes memory and I/O over FLOPS
- Appro-integrated 1,024-node Sandy Bridge cluster
- 300 TB of high-performance Intel flash
- Large-memory supernodes via vSMP Foundation from ScaleMP
- 3D torus interconnect from Mellanox
- In production operation since February 2012
- Funded by the NSF and available through the NSF Extreme Science and Engineering Discovery Environment (XSEDE) program

8 The Memory Hierarchy of a Typical Supercomputer (diagram): shared-memory programming within a single node, message-passing programming across nodes, then a latency gap before disk I/O, which is where big data sits.

9 The Memory Hierarchy of Gordon (diagram): shared-memory programming extended across nodes with vSMP, and flash filling the latency gap between memory and the disk I/O that holds big data.

10 Gordon Network Architecture (diagram). Data movers (4x) connect to the XSEDE & R&E networks and the SDSC network. A redundant front end provides round-robin login nodes (4x), mirrored NFS servers (4x), and management nodes (2x), with management and public edge/core Ethernet. The 1,024 compute nodes and 64 I/O nodes sit on a dual-rail QDR InfiniBand 3D torus (rail 1 and rail 2); the I/O nodes uplink over dual 10GbE through Arista 10 GbE switches to the Data Oasis Lustre PFS (4 PB). Link types shown: GbE, 2x10GbE, 10GbE, and QDR 40 Gb/s.

11 Gordon Intel Sandy Bridge compute node (photo).

12 Gordon Design Highlights
- 1,024 dual-socket Xeon E5 (Sandy Bridge) compute nodes: 16 cores and 64 GB per node, Intel Jefferson Pass motherboard, PCIe Gen3
- 3D torus interconnect, dual-rail QDR
- 300 GB Intel 710 eMLC SSDs, 300 TB aggregate
- 64 dual-socket Westmere I/O nodes: 12 cores and 48 GB per node, 4 LSI controllers, 16 SSDs, dual 10GbE, SuperMicro motherboard, PCIe Gen2
- Large-memory vSMP supernodes: 2 TB DRAM, 10 TB flash
- Data Oasis Lustre PFS: 100 GB/s, 4 PB
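
A quick check that the aggregate numbers above are consistent with the per-node figures (a sketch; the unit conventions are assumptions, with DRAM treated as binary and SSD capacity as decimal):

```python
# Aggregate checks for the Gordon design highlights.
compute_nodes, cores_per_node, dram_per_node_gb = 1024, 16, 64
io_nodes, ssds_per_io_node, ssd_gb = 64, 16, 300

total_cores = compute_nodes * cores_per_node                    # 16,384
total_dram_tb = compute_nodes * dram_per_node_gb / 1024         # 64 TB (binary units)
total_flash_tb = io_nodes * ssds_per_io_node * ssd_gb / 1000    # ~307 TB, quoted as "300 TB aggregate"
flash_per_io_node_tb = ssds_per_io_node * ssd_gb / 1000         # 4.8 TB per I/O node

print(total_cores, total_dram_tb, round(total_flash_tb), flash_per_io_node_tb)
```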

13 Gordon 32-way Supernode (diagram): 32 compute nodes aggregated with vSMP software, plus two I/O nodes, each a dual-socket Westmere I/O processor with 4.8 TB of flash.

14 Supernode Network Design Detail (diagram). The I/O nodes serve the flash and are the gateway to Data Oasis (the Lustre file system).

15 Gordon 3D Torus Interconnect Fabric (diagram): a 4x4x4 mesh whose ends are folded in all three dimensions to form a 3D torus. The network is dual-rail for increased bandwidth and redundancy, with a single connection from each node to each rail. Each torus vertex is a pair of 36-port fabric switches serving 16 compute nodes and 2 I/O nodes, with 18 x 4X IB network connections per switch.

16 Why a 3D Torus Interconnect?
- Lower cost: 40% as many switches and 25% to 50% fewer cables compared to a fat tree
- Works well for localized communication (a small hop-count sketch follows below)
- Linearly expandable
- Simple wiring pattern; short cables, so fiber-optic cables are generally not required
- Fault tolerant within the mesh with 2QoS alternate routing
- Fault tolerant with dual rails for all routing algorithms
- Based on the OFED IB stack
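
To make the locality point concrete, here is a minimal sketch (not SDSC code) of shortest-path hop counts between switch coordinates on a 4x4x4 torus, where the wraparound links keep even the worst case small:

```python
# Hop distance between two switch coordinates on a 3D torus with wraparound.
def torus_hops(a, b, dims=(4, 4, 4)):
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

print(torus_hops((0, 0, 0), (1, 0, 0)))  # 1 hop: nearest neighbor
print(torus_hops((0, 0, 0), (3, 0, 0)))  # 1 hop: wraparound link
print(torus_hops((0, 0, 0), (2, 2, 2)))  # 6 hops: worst case on a 4x4x4 torus
```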

17 Gordon System Layout (same basic design for Trestles)
- 16 racks, 1 service rack, and 4 I/O racks (all racks 48U)
- Hot-aisle containment
- 500 kW power
- Earthquake isobases

18 Gordon Speeds and Feeds (system specification)
Intel Sandy Bridge compute node:
- Sockets / cores: 2 / 16
- Clock speed: 2.6 GHz
- DRAM capacity: 64 GB
- Flash memory (SSD): 80 GB
- Peak performance: 333 Gflop/s
- Memory bandwidth: 85 GB/s
Intel flash I/O node:
- SSD capacity per drive / per node / total: 300 GB / 4.8 TB / 300 TB
- Flash bandwidth per drive (read/write): 270 MB/s / 210 MB/s
- IOPS (read/write): 38,000 / 2,300
vSMP supernode:
- Compute nodes: 16 or 32
- I/O nodes: 1 or 2
- Addressable DRAM: 1 or 2 TB
- Flash: 4.8 or 9.6 TB
Full system:
- Nodes / cores: 1,024 / 16,384
- Peak performance: 341 Tflop/s
- Aggregate memory: 64 TB
InfiniBand interconnect:
- Type: dual-rail QDR InfiniBand
- Aggregate torus bandwidth: 9.2 TB/s
- Link bandwidth: 8 GB/s (bidirectional)
- Latency (min-max): 1.25 µs - 2.5 µs
Data Oasis Lustre file system:
- Total storage: 4.5 PB (raw)
- I/O bandwidth: 100 GB/s

19 Software Stack: Consistent Across Systems for Ease of Use and Support
- Cluster management: Rocks
- Operating system: CentOS 5.6, modified for AVX
- InfiniBand: OFED, Mellanox subnet manager
- MPI: MVAPICH2 (native), MPICH2 (vSMP)
- Shared memory (Gordon): vSMP Foundation v4
- Flash (Gordon): iSCSI over RDMA (iSER), target daemon (tgtd), XFS, OCFS, et al.
- Parallel file system: Lustre
- Job scheduling: Torque (PBS), Catalina; local enhancements for topology-aware scheduling (Gordon)
- User environment: Modules; Rocks Rolls

20 Gordon Measured Performance
System:
- LINPACK: 286 Tflop/s (84% of peak)
- Node LINPACK: 290 Gflop/s (85% of peak)
- Memory bandwidth: 67 GB/s (79% of peak)
I/O node:
- 1 SSD, sequential performance when mounted on a compute node using XFS: 270/210 MB/s read/write (100% of spec)
- 16 SSDs, sequential remote (read/write) with iSER and no file system (using 2 rails): 4.5/4.4 GB/s (exceeds spec)
- 16 SSDs, IOPS local (read/write): 600K/120K (exceeds spec)
- 16 SSDs, IOPS remote (read/write): 160K/32K
Interconnect:
- Link bandwidth: rail 0: 3.9 GB/s; rail 1: 3.4 GB/s
- Latency (traversing the maximum number of switch hops): rail 0: 1.44 µs; rail 1: 2.14 µs
Data Oasis:
- From a single LNET router (I/O node): 1.6 GB/s write, 5 GB/s read
- From 32 LNET routers to 32 Lustre OSSs (half of the total system): 50 GB/s
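
The percentages of peak quoted above are simply the measured values divided by the corresponding peaks from the specification table, for example:

```python
# Efficiency figures: measured / peak, using values quoted on the slides.
system_linpack_tf, system_peak_tf = 286, 341
stream_gb_s, peak_mem_bw_gb_s = 67, 85

print(f"System LINPACK: {system_linpack_tf / system_peak_tf:.0%} of peak")  # ~84%
print(f"Memory bandwidth: {stream_gb_s / peak_mem_bw_gb_s:.0%} of peak")    # ~79%
```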

21 SSDs Are a Good Fit for Data-Intensive Computing
Flash SSD vs. typical HDD:
- Latency: < 0.1 ms vs. 10 ms
- Bandwidth (read/write): 270/210 MB/s for the SSD
- IOPS (read/write): 38,500 read for the SSD
- Power consumption during reads/writes: 2-5 W vs. 6-10 W
- Price/GB: $3/GB vs. $0.50/GB
- Endurance: 2-10 PB vs. N/A
- Total cost of ownership: the jury is still out.

22 Flash: Option 1 - One SSD per Compute Node
- One 300 GB flash drive is exported to each compute node and appears as a local file system; the Lustre parallel file system is mounted identically on all nodes.
- Data is purged at the end of the run.
- Use cases: applications that need local, temporary scratch (Gaussian, Abaqus).
- Logical view: the file system appears as /scratch/$user/$pbs_jobid.
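
A minimal sketch of how a job might use this node-local flash scratch space (hypothetical; it assumes the scheduler exports USER and PBS_JOBID and that the directory layout matches the slide):

```python
# Write temporary job data to the flash-backed scratch directory instead of Lustre.
import os

user = os.environ.get("USER", "unknown")
jobid = os.environ.get("PBS_JOBID", "interactive")
scratch = f"/scratch/{user}/{jobid}"

os.makedirs(scratch, exist_ok=True)     # create it if it does not already exist
with open(os.path.join(scratch, "work.tmp"), "w") as f:
    f.write("intermediate results live on node-local flash\n")
print("scratch directory:", scratch)
```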

23 Flash: Option 2 - 16 SSDs for One Compute Node
- 16 SSDs in a RAID 0 appear as a single 4.8 TB file system to the compute node.
- Flash I/O and Lustre traffic use rail 1 of the torus.
- Use cases: databases, data mining, Gaussian, Abaqus.
- Logical view: the file system appears as /scratch/$user/$pbs_jobid.

24 Flash: Option 3 - 16 SSDs within a vSMP Supernode
- 4.8 TB of flash presented as a single XFS file system to a 16-node virtual image (1 TB of memory); it appears as /scratch1/$user/$pbs_jobid (/scratch2 is available if using a 32-node supernode).
- Flash I/O uses both rail 0 and rail 1.
- Data is purged at the end of a run.
- Use cases: serial and threaded applications that need large memory and local disk, e.g. Abaqus and genomics codes (Velvet, ALLPATHS, etc.).

25 Flash: Option 4 - Dedicated I/O Node (a new allocations model for data-intensive computing)
- A 4.8 TB I/O node is allocated for up to one year, and users run applications directly on the I/O node.
- Users may request dedicated compute nodes or access them through the scheduler.
- Data is persistent.
- Use cases: databases, data mining and analytics.

26 Flash: Option 5 - 16 SSDs Shared by 16 Compute Nodes (we are testing various options; please talk with us)
- 16 SSDs in a RAID 0 appear as a single 4.8 TB file system to all compute nodes.
- Use cases: MPI applications (a minimal MPI sketch follows below).
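
A hypothetical sketch of an MPI job using such a shared flash file system, with each rank writing its own file into the same scratch directory (assumes mpi4py is available and that the path layout matches the slides):

```python
# Each MPI rank writes its own output file to the shared flash-backed scratch space.
import os
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

scratch = f"/scratch/{os.environ.get('USER', 'unknown')}/{os.environ.get('PBS_JOBID', 'test')}"
os.makedirs(scratch, exist_ok=True)

with open(os.path.join(scratch, f"rank{rank:04d}.out"), "w") as f:
    f.write(f"rank {rank} of {comm.Get_size()} wrote to shared flash\n")

comm.Barrier()
if rank == 0:
    print("all ranks finished writing to", scratch)
```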

27 Which I/O System Is Right for My Application?
Performance:
- Flash-based I/O nodes: SSDs support low-latency I/O, high IOPS, and high bandwidth; one SSD can deliver 37K IOPS. Flash resources are dedicated to the user, so performance is largely independent of what other users are doing on the system.
- Lustre (Data Oasis): Lustre is ubiquitous in HPC. It does well for sequential I/O and for workloads that do I/O to a few files from many cores simultaneously; random I/O is a Lustre killer. Lustre is a shared resource, so performance will vary depending on what other users are doing. (A small benchmark sketch contrasting the two access patterns follows below.)
Infrastructure:
- Flash: SSDs are deployed in I/O nodes using iSER, an RDMA protocol, and are accessed over the InfiniBand network.
- Lustre: 64 OSSs with distinct file systems and metadata servers, built on HDDs and accessed over a 10GbE network via the I/O nodes.
Persistence:
- Flash: data is generally removed at the end of a run so the resource can be made available to the next job.
- Lustre: most is deployed as scratch and is purgeable by policy (not necessarily at the end of the job); some is deployed as a persistent project storage resource.
Capacity:
- Flash: up to 4.8 TB per user, depending on configuration.
- Lustre: no specific limits or quotas imposed on scratch; the file system is ~2 PB.
Use cases:
- Flash: local application scratch (Abaqus, Gaussian); a data-mining platform (e.g., Hadoop); graph problems.
- Lustre: traditional HPC I/O associated with MPI applications; pre-staging of data that will be pulled into flash.
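
A rough micro-benchmark sketch (not SDSC code) that contrasts the two access patterns discussed above; point it at a test file on flash scratch or on Lustre to compare. The block sizes are arbitrary choices, and results will include page-cache effects unless the file is much larger than memory:

```python
# Compare streaming reads (Lustre-friendly) with small random reads (flash-friendly).
import os, random, time

def sequential_read_mb_s(path, block=1 << 20):
    start = time.time()
    with open(path, "rb") as f:
        while f.read(block):
            pass
    return os.path.getsize(path) / (time.time() - start) / 1e6

def random_read_ops_s(path, block=4096, count=10000):
    size = os.path.getsize(path)
    start = time.time()
    with open(path, "rb") as f:
        for _ in range(count):
            f.seek(random.randrange(0, max(1, size - block)))
            f.read(block)
    return count / (time.time() - start)

if __name__ == "__main__":
    target = "/scratch/testfile.bin"   # hypothetical path: supply your own large test file
    print("sequential:", round(sequential_read_mb_s(target)), "MB/s")
    print("random 4K reads:", round(random_read_ops_s(target)), "ops/s")
```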

28 Gordon/Trestles Comparison
Trestles:
- Compute nodes: 324-node AMD Magny-Cours cluster, integrated by Appro; 10,368 cores; ~20 TB DRAM; 100 Tflop/s
- Node: quad socket, 32 cores (8 per processor), 64 GB/node (2 GB/core), 2.4 GHz
- Flash memory: ~60 GB/node, MLC
- Interconnect: fat tree, single-rail QDR
- Target users: small and modest-size jobs; gateways; rapid turnaround; allocations < 1.5M SUs
Gordon:
- Compute nodes: 1,024-node Intel Xeon E5 cluster, integrated by Appro; 16,384 cores; 64 TB DRAM; 341 Tflop/s
- Node: dual socket, 16 cores (8 per processor), 64 GB/node (4 GB/core), 2.6 GHz
- Flash memory: ~300 GB/node, eMLC (4.8 TB aggregated per I/O node)
- Interconnect: 3D torus, dual-rail QDR
- Target users: memory- and I/O-intensive applications; modest-sized jobs; gateways; no specific limit on allocations

29 Data Oasis
- High-performance, high-capacity Lustre-based parallel file system
- A centerpiece of SDSC's data-intensive infrastructure
- 10GbE I/O backbone for all of SDSC's HPC systems, supporting multiple architectures
- Scalable, open-platform design, integrated by Aeon Computing
- 4 PB capacity and growing
- 100 GB/s demonstrated performance
- Configured as individual scratch and project (non-purged) file systems for each system

30 Data Oasis Heterogeneous Architecture (diagram). Three distinct network architectures feed the file system: the Trestles IB cluster through a Mellanox 5020 bridge (12 GB/s), the Gordon IB cluster through 64 Lustre LNET routers (100 GB/s), and the Triton Myrinet cluster through a Myrinet 10G switch (25 GB/s). Redundant Arista 10GbE core switches provide reliability and performance. Metadata servers serve Trestles scratch, Gordon scratch, the Gordon & Trestles project file system, and Triton scratch. 64 OSSs (object storage servers, 72 TB each) provide 100 GB/s of performance and more than 4 PB of raw capacity; JBODs (just a bunch of disks, 90 TB each) provide capacity scale-out to an additional 5.8 PB.

31 Data Oasis Performance Measured from Gordon

32 Applications

33 Breadth-First Search Comparison Using SSD and HDD
Graphs are mathematical and computational representations of relationships among objects in a network. Such networks occur in many natural and man-made settings, including communication, biological, and social contexts, and understanding their structure is important for uncovering relationships among the members.
- Implementation of the breadth-first search (BFS) graph algorithm developed by Munagala and Ranade
- 134 million nodes
- Flash drives reduced I/O time by a factor of 6.5
- The problem was converted from I/O bound to compute bound
Source: Sandeep Gupta, San Diego Supercomputer Center. Used by permission.
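
For reference, a generic in-memory BFS looks like the sketch below; the Munagala-Ranade algorithm used in this work restructures the same level-by-level traversal so that the frontier and edge lists stream from external storage (here, flash) rather than RAM:

```python
# Level-ordered breadth-first search over an adjacency-list graph.
from collections import deque

def bfs_levels(adj, source):
    """Return {vertex: BFS level} for every vertex reachable from source."""
    level = {source: 0}
    frontier = deque([source])
    while frontier:
        u = frontier.popleft()
        for v in adj.get(u, ()):
            if v not in level:
                level[v] = level[u] + 1
                frontier.append(v)
    return level

graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
print(bfs_levels(graph, 0))   # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
```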

34 Foxglove Calculation Using Gaussian 09 with vSMP: MP2 Energy Gradient Calculation
The foxglove plant (Digitalis) is studied for its medicinal uses. Digoxin, an extract of the foxglove, is used to treat a variety of conditions including diseases of the heart, and some recent research suggests it may also be a beneficial cancer treatment.
- Processor footprint: 4 nodes, 64 threads (1 node = 16 cores, 64 GB)
- Time to solution: 43,000 s
- Memory footprint: 10 nodes, 700 GB
Source: Jerry Greenberg, San Diego Supercomputer Center. January 2012.

35 Daphnia Genome Assembly Using Velvet and vSMP
Daphnia (the water flea) is a model species used for understanding mechanisms of inheritance and evolution, and as a surrogate species for studying human health responses to environmental changes. Photo: Dr. Jan Michels, Christian-Albrechts-University, Kiel.
- De novo assembly of short DNA reads using the de Bruijn graph algorithm
- Code parallelized using OpenMP directives
- Benchmark problem: Daphnia genome assembly from 44-bp and 75-bp reads using 35-mers
Source: Wayne Pfeiffer, San Diego Supercomputer Center. Used by permission.
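
To illustrate the data structure involved, here is a toy sketch (not Velvet code) of de Bruijn graph construction from reads: each read is split into overlapping k-mers, and edges connect k-mers that overlap by k-1 bases. At production scale (35-mers over hundreds of millions of reads) this graph is what consumes the large shared memory that vSMP provides:

```python
# Build a tiny de Bruijn graph: (k-1)-mer prefixes map to the suffixes that follow them.
from collections import defaultdict

def de_bruijn_edges(reads, k):
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph

reads = ["ACGTAC", "CGTACG"]
for prefix, suffixes in sorted(de_bruijn_edges(reads, k=4).items()):
    print(prefix, "->", suffixes)
```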

36 Axial Compression of a Caudal Rat Vertebra Using Abaqus and vSMP
The goal of the simulations is to analyze how small variances in boundary conditions affect high-strain regions in the model. The research goal is to understand the response of trabecular bone to mechanical stimuli, which has relevance for paleontologists inferring the habitual locomotion of ancient people and animals, and for treatment strategies for populations with fragile bones such as the elderly.
- 5 million quadratic, 8-noded elements
- Model created with a custom Matlab application that converts 25 3 micro-CT images into voxel-based finite element models
Source: Matthew Goff, Chris Hernandez, Cornell University. Used by permission, 2012.

37 MrBayes Running on Trestles and Gordon through the CIPRES Gateway
- MrBayes is used extensively via the CIPRES Science Gateway to infer phylogenetic trees; the hybrid parallel version running at SDSC uses both MPI and OpenMP.
- CIPRES has allowed over 4,000 biologists worldwide to run parallel tree-inference codes via a simple-to-use web interface, and applications can be targeted to appropriate architectures.
- Gordon provides a significant speedup for unpartitioned data sets over the SDSC Trestles system.
- A model for future data-intensive projects.
Source: Wayne Pfeiffer, San Diego Supercomputer Center.

38 Conclusions
- Trestles meets the needs of a majority of users: small and modest job sizes, rapid turnaround.
- Gordon is a data-intensive system for users requiring large memory, or I/O characterized by many small transactions or random access patterns.
- Both Trestles and Gordon are gateway friendly.
- Data Oasis is a high-performance, high-capacity parallel file system that is available on both Trestles and Gordon. Project storage is available as an allocable resource, is cross-mounted on Trestles and Gordon, and an expansion of an additional 2 PB is planned.
- High-speed networks and data movers to SDSC systems facilitate large file transfers to and from home institutions.
- A wide range of applications are running on both systems and make use of their innovative features.

39 Thank you
