Diamond Networks/Computing. Nick Rees January 2011

2008 computing requirements
Diamond originally had no provision for central science computing. This started to develop in 2007-2008, with the major development in 2008:
- Resilient high-density computer room.
- Compute cluster with a minimum of 1500 SPECfp_rate2006 total.
- Resilient network with a dual 10 Gbit core.
- 200 Tbytes of storage with 20 Mbytes/sec/Tbyte scalable aggregated throughput and 8000 metadata operations/sec.

Overall a success
Computer room:
- Up to 320 kW of redundant power (A and B feeds) from two separate sub-stations; the A feed is UPS- and generator-backed.
- Up to 320 kW of cooling water. Primary cooling is from site chilled water, with a 220 kW standby chiller in case of problems.
- Commissioning tests ran satisfactorily at a 160 kW power load.
- The standby system has proved its worth a number of times.

Layout [slide: computer room layout diagram, with the cable tray and water pipes labelled]

Testing with heat loads

Temperature rise on flow failure (0.5 °C/sec)

Now

Compute cluster
This was a success:
- Easy to specify.
- One supplier validated their system against the specification, and they were the successful supplier.
A few teething problems:
- The network driver had problems acting as both the IPMI and the normal interface at the same time; sometimes multiple restarts were needed to get the network interface configured correctly.
- The problem was fixed with updated Ethernet firmware and new switch hardware.

Computing solution
1U twin-motherboard systems. Each motherboard is:
- Supermicro X7DWT
- Intel 5400 (Seaburg) chipset
- Dual Intel Xeon E5420 (quad-core, 2.5 GHz) processors
- 16 GB DDR2-667 ECC RAM, installed as 4x4 GB FB-DIMMs
- 160 GB SATA hard disk
- 1x16 PCI-Express slot (for GPU extension)
Total: 20 systems, 16 cores/system, 320 cores (totals worked through in the sketch below).
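As a cross-check on the slide's totals, here is a minimal sketch that recomputes the core count from the node specification above; the per-chassis breakdown comes from the slide, while the cluster-wide memory figure is derived arithmetic rather than a quoted number.

```python
# Recompute the cluster totals from the node specification above.
chassis = 20                # 1U twin-motherboard systems
boards_per_chassis = 2
sockets_per_board = 2       # dual Xeon E5420
cores_per_socket = 4        # quad-core
ram_gb_per_board = 16       # 4 x 4 GB FB-DIMMs

cores_per_chassis = boards_per_chassis * sockets_per_board * cores_per_socket
total_cores = chassis * cores_per_chassis
total_ram_gb = chassis * boards_per_chassis * ram_gb_per_board  # derived, not quoted

print(f"{cores_per_chassis} cores/system, {total_cores} cores total")   # 16, 320
print(f"~{total_ram_gb} GB RAM across the cluster")                     # 640 GB
```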

Current compute clusters
We now have multiple compute clusters:
- The original IBM MX cluster.
- The new Supermicro cluster, 320 cores.
- The Accelerator Physics cluster, 240 Supermicro cores, but with an InfiniBand interconnect.
- An Nvidia Tesla cluster with Supermicro front ends for tomography.

Network
Overall, another success:
- Two core switches (from different vendors), in separate computer rooms.
- Each beamline is connected to both switches, so either 2 Gbit or 20 Gbit peak bandwidth is available (halved if one core switch fails).
- Traffic is routed by vendor-independent protocols: OSPF, ECMP and LACP (see the toy flow-hashing sketch after this list).
- The cluster switch is connected with 2 x 10 Gbit to each core switch.
- The storage system is connected directly into the core.
- Rack-top switches in both computer rooms are also connected to both cores.
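The bandwidth figures above are per-flow as well as per-link: ECMP and LACP pin each flow to a single path by hashing its headers. The following toy sketch (not Diamond's switch configuration; the addresses, ports and hash choice are purely illustrative) shows the idea.

```python
# Toy illustration of ECMP/LACP-style flow hashing: each flow's 5-tuple is
# hashed onto one of the equal-cost links, so a single flow uses one path
# while the aggregate of many flows spreads across both core switches.
import hashlib

def pick_link(src_ip, dst_ip, src_port, dst_port, proto, n_links):
    """Map a flow 5-tuple onto one of n_links equal-cost paths."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links

# A beamline switch with two 1 Gbit uplinks: every flow lands on one of them.
print(pick_link("10.0.1.10", "10.0.5.20", 40000, 2049, "tcp", n_links=2))
```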

Network Layout

Network Layout [diagram: Phase 1 and Phase 2 beamlines each behind a beamline switch (1 Gbit/s links), central switches, cluster switch, cluster (40x1 Gbit/s) and disk systems, with labelled link speeds of 0.5, 2, 10, 40, 60 and 80 Gbit/s]

Network issues
At high data rates packets get lost. The standard TCP back-off time is 200 ms, which is a lot of data at 10 Gbit/s (roughly 250 Mbytes at wire speed). We see the problem mostly when 10 Gbit servers write to 1 Gbit clients. Other sites see similar issues: use your favorite search engine to look up "Data Center Ethernet" or "Data Center Bridging".
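As a rough illustration of why a 200 ms back-off hurts at these speeds, the following sketch (not from the slides) computes how much data one retransmission timeout represents at 1 Gbit/s and 10 Gbit/s.

```python
# How much data a 200 ms TCP retransmission timeout represents at wire speed.
RTO_SECONDS = 0.200   # standard minimum TCP back-off mentioned above

def stalled_megabytes(line_rate_gbit_per_s):
    """Data that could have been sent during one timeout at full line rate."""
    return line_rate_gbit_per_s * 1e9 / 8 * RTO_SECONDS / 1e6

for rate in (1, 10):
    print(f"{rate:>2} Gbit/s: one 200 ms stall costs ~{stalled_megabytes(rate):.0f} MB")
# 1 Gbit/s  -> ~25 MB
# 10 Gbit/s -> ~250 MB
```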

Storage Requirements
- 200 TBytes usable disk space.
- 20 Mbytes/sec/Tbyte scalable aggregated throughput, i.e. 4 GBytes/sec aggregated throughput for 200 Tbytes (worked through in the sketch after this list).
- 100 MBytes/sec transfer rate for individual 1 Gbit clients, 400 MBytes/sec for individual 10 Gbit clients.
- 8000 metadata operations/sec.
- Highly resilient.
- Support for Linux clients (RHEL4U6 and RHEL5 or later).
- POSIX with extended attributes and ACLs.
- File access control based on groups (> 256 groups/user).
- Ethernet clients.
- Extendable by at least 200 TB for future growth.
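The aggregate targets follow directly from the per-unit figures; the sketch below simply restates that arithmetic (the percentage comments are derived from nominal link rates, not from the specification itself).

```python
# Restate the aggregate storage targets implied by the per-unit figures above.
usable_tb = 200               # usable disk space (TB)
mb_per_sec_per_tb = 20        # scalable aggregated throughput requirement
aggregate_mb_per_sec = usable_tb * mb_per_sec_per_tb   # 4000 MB/s = 4 GB/s

# Per-client targets versus nominal link capacity (125 MB/s per 1 Gbit/s).
one_gig_client = 100          # MB/s target, ~80% of a 1 Gbit/s link
ten_gig_client = 400          # MB/s target, ~32% of a 10 Gbit/s link

print(f"Aggregate throughput target: {aggregate_mb_per_sec / 1000:.0f} GB/s")
print(f"1 Gbit client: {one_gig_client} MB/s, 10 Gbit client: {ten_gig_client} MB/s")
```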

Storage - solution
Based on the Data Direct Networks S2A9900 system:
- Up to 5.7 GB/s throughput.
- Fault-tolerant architecture.
- Runs the Lustre file system: an open-source file system, acquired by Sun in 2008 and now part of Oracle, and the most popular file system in the top 500 supercomputers.

Storage - problems
In initial tests:
- Sometimes got near 4 Gbytes/sec writing, but only ever about 3 Gbytes/sec reading.
- Only got 8000 metadata operations/sec for the specific case of sequential creates of zero-length files; other metadata operations were slower (~2000/sec).
In reality, with real users:
- We seem to be limited primarily by metadata rates.
- The 8000 operations/sec specification was based on 40 clients creating 200 files/sec; in practice a detector takes 3-5 metadata operations to create a file (see the worked example below).
- The total rates also conceal that the latency for some individual files can be up to 15 seconds under load. This may now be fixed...
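To make the metadata arithmetic concrete, the sketch below combines the figures quoted above; the choice of 4 ops/file within the quoted 3-5 range, and the resulting effective file rate, are illustrative derivations rather than measured numbers.

```python
# Worked example of the metadata budget described above.
# The 8000 ops/sec spec assumed 40 clients each creating 200 files/sec with
# one metadata operation per file. In practice a detector needs ~3-5
# operations per file and general metadata rates were nearer 2000 ops/sec,
# so the effective file-creation rate is far below the headline figure.
spec_clients = 40
spec_files_per_client_per_sec = 200
spec_ops_per_sec = spec_clients * spec_files_per_client_per_sec  # 8000

measured_ops_per_sec = 2000   # typical rate for non-create metadata operations
ops_per_file = 4              # illustrative value within the quoted 3-5 range

effective_files_per_sec = measured_ops_per_sec / ops_per_file
print(f"Specified: {spec_ops_per_sec} metadata ops/sec")
print(f"Effective creation rate: ~{effective_files_per_sec:.0f} files/sec")
```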

Storage status: Data sizes by beamline

Data growth by beamline [chart: data volume per beamline from January 2007 to January 2011; legend: b16, b18, b22, b23, i02, i03, i04, i04-1, i06, i06-1, i07, i10, i11, i12, i15, i16, i18, i19, i20, i22, i24]

Total data growth

Where to next? Data sizes [chart: total size of Diamond data (TiB), log scale, Jan 2007 to Jan 2023, showing Total Data and Staff Data together with a Decreasing Exponential Model and a Best-fit Cubic projection]

Where to next? - data rates [chart: predicted rate of data production at Diamond (TiB/year), log scale, Jan 2007 to Jan 2023, showing the Actual Rate together with a Decreasing Exponential Rate Model and a Best-fit Cubic Rate projection]
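The projections in the two charts above come from fitting simple models to the measured growth. The sketch below shows one such fit, a cubic in log space, using made-up placeholder totals rather than Diamond's actual figures.

```python
# One way to produce projections like those in the charts: fit a cubic to
# log10(total data volume) against time and extrapolate. The totals below
# are made-up placeholders, NOT Diamond's measured figures.
import numpy as np

years = np.array([2007, 2008, 2009, 2010, 2011], dtype=float)
total_tib = np.array([1.0, 5.0, 20.0, 60.0, 150.0])   # hypothetical values

coeffs = np.polyfit(years - years[0], np.log10(total_tib), deg=3)  # best-fit cubic
model = np.poly1d(coeffs)

for future in (2013, 2015, 2017):
    projected_tib = 10 ** model(future - years[0])
    print(f"{future}: ~{projected_tib:,.0f} TiB (cubic extrapolation)")
```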

Where to next?
Proposing to create a rolling programme for computing development based on Moore's Law and typical hardware lifetimes of 5 years:
- Replace most hardware every 5 years.
- Significant cluster and storage purchases every 2-3 years.
This is a significant investment, but we must develop computing proactively in parallel with beamline data rates, rather than reactively because of them.
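As a back-of-the-envelope illustration of the rolling programme's logic (not from the slides), the sketch below assumes a two-year Moore's Law doubling time and shows the capability step a five-year refresh cycle buys.

```python
# Back-of-the-envelope arithmetic for a Moore's-Law rolling replacement:
# assuming capability per unit cost doubles every DOUBLING_YEARS, a refresh
# every CYCLE_YEARS buys roughly 2**(CYCLE_YEARS/DOUBLING_YEARS) times the
# capability for the same spend.
DOUBLING_YEARS = 2.0          # assumed Moore's-Law doubling time
CYCLE_YEARS = 5               # hardware lifetime from the slide

gain_per_refresh = 2 ** (CYCLE_YEARS / DOUBLING_YEARS)
print(f"Capability gain per {CYCLE_YEARS}-year refresh: ~{gain_per_refresh:.1f}x")
# ~5.7x per refresh under these assumptions
```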

Conclusions
- We have come from behind in computing but are gradually catching up.
- Diamond is probably one of the most challenging synchrotrons, with a large proportion of MX and an ambitious tomography group.
- We must take a long-term, proactive approach to keep ahead of beamline requirements.