Scientific Computing at SLAC


Scientific Computing at SLAC
Richard P. Mount, Director: Scientific Computing and Computing Services
DOE Review, June 15, 2005

Scientific Computing: the relationship between science and the components of scientific computing
- Application sciences: high-energy and particle-astro physics, accelerator science, photon science
- Issues addressable with computing: particle interactions with matter, electromagnetic structures, huge volumes of data, image processing
- Computing techniques: PDE solving, algorithmic geometry, visualization, meshes, object databases, scalable file systems
- Computing architectures: single system image, low-latency clusters, throughput-oriented clusters, scalable storage
- Computing hardware: processors, I/O devices, mass-storage hardware, random-access hardware, networks and interconnects

Scientific Computing: SLAC's goals for leadership in Scientific Computing
[Diagram: the stack from application sciences (SLAC + Stanford science) through issues addressable with computing, computing techniques, computing architectures, and computing hardware, grouped under three goals: the science of scientific computing, computing for data-intensive science, and collaboration with Stanford and industry.]

Scientific Computing: current SLAC leadership and recent achievements in Scientific Computing
- PDE solving for complex electromagnetic structures
- Huge-memory systems for data analysis
- Scalable data management
- SLAC + Stanford science
- GEANT4: photon/particle interaction in complex structures (in collaboration with CERN)
- World's largest database
- Internet2 Land-Speed Record; SC2004 Bandwidth Challenge

SLAC Scientific Computing Drivers
- BaBar (data-taking ends December 2008): the world's most data-driven experiment; data analysis challenges until the end of the decade
- KIPAC: from cosmological modeling to petabyte data analysis
- Photon Science at SSRL and LCLS: ultrafast science, modeling and data analysis
- Accelerator Science: modeling electromagnetic structures (PDE solvers in a demanding application)
- The broader US HEP program (aka LHC): contributes to the orientation of SLAC Scientific Computing R&D

SLAC-BaBar Computing Fabric
- Clients: 1700 dual-CPU Linux and 400 single-CPU Sun/Solaris machines, connected over an IP network (Cisco), running HEP-specific ROOT software (Xrootd) + the Objectivity/DB object database
- Disk servers: 120 dual/quad-CPU Sun/Solaris machines with ~400 TB of Sun FibreChannel RAID arrays, connected over an IP network (Cisco), running HPSS + SLAC enhancements to the ROOT and Objectivity server code
- Tape servers: 25 dual-CPU Sun/Solaris machines, 40 STK 9940B drives, 6 STK 9840A drives, 6 STK Powderhorn silos, over 1 PB of data

BaBar Computing at SLAC
- Farm processors (5 generations, 3700 CPUs)
- Servers (the majority of the complexity)
- Disk storage (2+ generations, 400+ TB)
- Tape storage (40 drives)
- Network backplane (~26 large switches)
- External network

Rackable Intel P4 Farm (bought in 2003/4): 384 machines, 2 per rack unit, dual 2.6 GHz CPUs

Disks and Servers: 1.6 TB usable per tray, ~160 trays bought in 2003/4

Tape Drives
- 40 STK 9940B (200 GB) drives
- 6 STK 9840 (20 GB) drives
- 6 STK silos (capacity 30,000 tapes)

BaBar Farm/Server Network: ~26 Cisco 65xx switches

SLAC External Network (June 14, 2005)
- 622 Mbits/s to ESnet
- 1000 Mbits/s to Internet2
- ~300 Mbits/s average traffic
- Two 10 Gbits/s wavelengths to ESnet and UltraScience Net/NLR coming in July

Research Areas (1) (funded by DOE-HEP, DOE SciDAC, and DOE-MICS)
- Huge-memory systems for data analysis (SCCS Systems group and BaBar): expected major growth area (more later)
- Scalable data-intensive systems (SCCS Systems and Physics Experiment Support groups):
  - The world's largest database (OK, not really a database any more)
  - How to maintain performance with data volumes growing like Moore's Law?
  - How to improve performance by factors of 10, 100, 1000, ...? (intelligence plus brute force)
  - Robustness, load balancing, and troubleshootability in 1000-10000-box systems
  - Astronomical data analysis on a petabyte scale (in collaboration with KIPAC)

Research Areas (2) (funded by DOE-HEP, DOE SciDAC, and DOE-MICS)
- Grids and security (SCCS Physics Experiment Support, Systems, and Security groups): PPDG: building the US HEP Grid OSG; security in an open scientific environment; accounting, monitoring, troubleshooting, and robustness
- Network research and stunts (SCCS Network group, Les Cottrell et al.): land-speed record and other trophies
- Internet monitoring and prediction (SCCS Network group):
  - IEPM: Internet End-to-End Performance Monitoring (~5 years)
  - SLAC is the/a top user of ESnet and the/a top user of Internet2 (Fermilab doesn't do so badly either)
  - INCITE: Edge-based Traffic Processing and Service Inference for High-Performance Networks

Research Areas (3) (funded by DOE-HEP, DOE SciDAC, and DOE-MICS)
- GEANT4: simulation of particle interactions in million- to billion-element geometries (SCCS Physics Experiment Support group: M. Asai, D. Wright, T. Koi, J. Perl, ...): BaBar, GLAST, LCD, the LHC program, space, medical
- PDE solving for complex electromagnetic structures (Kwok's Advanced Computing Department + SCCS clusters)

Growing Competences
- Parallel computing (MPI, ...): driven by KIPAC (Tom Abel) and ACD (Kwok Ko); SCCS competence in parallel computing (= Alf Wachsmann currently); MPI clusters and an SGI SSI system
- Visualization: driven by KIPAC and ACD; SCCS competence is currently experimental-HEP focused (WIRED, HepRep, ...), a polite way of saying that growth is needed

A Leadership-Class Facility for Data-Intensive Science
Richard P. Mount, Director, SLAC Computing Services; Assistant Director, SLAC Research Division
Washington DC, April 13, 2004

Technology Issues in Data Access
- Latency
- Speed/bandwidth
- (Cost)
- (Reliability)

Latency and Speed: Random Access
[Chart: Random-Access Storage Performance: retrieval rate (MBytes/s, logarithmic scale) versus log10(object size in bytes) for PC2100 DRAM, a WD 200 GB disk, and an STK 9940B tape drive.]

Latency and Speed: Random Access
[Chart: Historical Trends in Storage Performance: retrieval rate (MBytes/s, logarithmic scale) versus log10(object size in bytes) for PC2100 DRAM, a WD 200 GB disk, and an STK 9940B tape drive, compared with RAM, disk, and tape of 10 years ago.]
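The shape of these curves follows from a simple model: each random access pays a fixed latency before data starts to flow, so the effective retrieval rate is object_size / (latency + object_size / streaming_bandwidth). The minimal sketch below uses assumed, order-of-magnitude latency and bandwidth figures for DRAM, disk, and tape (not values taken from the charts) and reproduces the characteristic roll-off at small object sizes.

```python
# Illustrative model of the curves above: effective retrieval rate for random
# access as a function of object size, given a fixed access latency and a
# streaming bandwidth. The device numbers are rough, assumed figures for
# ~2005-era hardware, not values taken from the slides.

DEVICES = {
    #                  latency (s)  streaming bandwidth (bytes/s)
    "PC2100 DRAM":     (100e-9,     2.1e9),
    "WD 200GB disk":   (10e-3,      50e6),
    "STK 9940B tape":  (60.0,       30e6),   # includes mount/seek time
}

def retrieval_rate(size_bytes, latency_s, bandwidth_Bps):
    """Effective rate (MBytes/s) for one random access of size_bytes."""
    transfer_time = latency_s + size_bytes / bandwidth_Bps
    return size_bytes / transfer_time / 1e6

if __name__ == "__main__":
    for exp in range(0, 11):          # object sizes from 1 byte to 10 GB
        size = 10 ** exp
        rates = [f"{retrieval_rate(size, lat, bw):12.6f}"
                 for lat, bw in DEVICES.values()]
        print(f"10^{exp:>2} bytes: " + "  ".join(rates))
```

For tiny objects the fixed latency dominates and the effective rate collapses by many orders of magnitude for disk and tape, which is exactly the gap the data-cache memory architecture in the following slides is meant to close.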

The Strategy
There is significant commercial interest in an architecture including data-cache memory. But from interest to delivery will take 3-4 years, and applications will take time to adapt (not just codes, but their whole approach to computing) to exploit the new architecture. Hence, two phases:
1. Development phase (years 1, 2, 3): commodity hardware taken to its limits; BaBar as principal user, adapting existing data-access software to exploit the configuration; BaBar/SLAC contribution to hardware and manpower; publicize results; encourage other users; begin collaboration with industry
2. Production-class facility (year 3 onwards): optimized architecture; strong industrial collaboration; wide applicability

PetaCache: The Team
- David Leith, Richard Mount: PIs
- Randy Melen: project leader
- Bill Weeks: performance testing
- Andy Hanushevsky: xrootd
- Systems group members
- Network group members
- BaBar (Stephen Gowdy)

Development Machine Design Principles
- Attractive to scientists:
  - Big enough data-cache capacity to promise revolutionary benefits
  - 1000 or more processors
  - Processor to (any) data-cache memory latency < 100 µs
  - Aggregate bandwidth to data-cache memory > 10 times that to a similar-sized disk cache
  - Data-cache memory should be 3% to 10% of the working set (approximately 10 to 30 terabytes for BaBar); a sizing sketch follows this list
- Cost-effective, but acceptably reliable:
  - Constructed from carefully selected commodity components
  - Cost no greater than (cost of commodity DRAM) + 50%
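As a rough check on the 10-30 TB figure, here is a minimal sizing sketch. The working-set size and per-node memory are assumptions (the 16 GB per node figure is taken from the deployment slides later in this talk; the ~300 TB working set is inferred from the 3%-10% range, not stated on the slide).

```python
# Back-of-the-envelope sizing for the data-cache principle above:
# the cache should hold 3%-10% of the working set, i.e. roughly 10-30 TB
# for BaBar. Both input numbers below are assumptions for illustration.

working_set_tb = 300          # assumed BaBar working set (TB)
node_memory_gb = 16           # memory per data-server node (GB), per later slides

for fraction in (0.03, 0.10):
    cache_tb = working_set_tb * fraction
    nodes = cache_tb * 1024 / node_memory_gb
    print(f"{fraction:.0%} of working set = {cache_tb:5.1f} TB "
          f"-> ~{nodes:.0f} nodes of {node_memory_gb} GB each")
```

Under these assumptions the cache needs on the order of 600 to 2000 data-server nodes, which is consistent with the "1000 or more processors" principle.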

Development Machine Design Choices
- Intel/AMD server mainboards with 4 or more ECC DIMM slots per processor
- 2 GByte DIMMs (4 GByte too expensive this year)
- 64-bit operating system and processor: favors Solaris and AMD Opteron
- Large (500+ port) switch fabric: large IP switches are most cost-effective
- Use of the ($10M+) BaBar disk/tape infrastructure, augmented for any non-BaBar use

Development Machine Deployment: Proposed Year 1
- 650 nodes, each with 2 CPUs and 16 GB memory
- Memory interconnect: switch fabric (Cisco/Extreme/Foundry)
- Storage interconnect: switch fabric (Cisco/Extreme/Foundry)
- > 100 disk servers, provided by BaBar

Development Machine Deployment: Currently Funded
- Clients: up to 2000 nodes, each with 2 CPUs and 2 GB memory, running Linux (existing HEP-funded BaBar systems), behind a Cisco switch
- Data servers: 64-128 nodes, each a Sun V20z with 2 Opteron CPUs and 16 GB memory, up to 2 TB total memory, running Solaris, behind Cisco switches (PetaCache MICS funding)

Latency (1): Ideal
[Diagram: client application accessing memory directly.]

Latency (2): Current reality
[Diagram: client application reaching a data server through the data-server client, OS, TCP stack, and NIC on each end, across network switches, with the server's OS, file system, and disk in the path.]

Latency (3): Immediately practical goal
[Diagram: the same network path as in (2), but with data served from the data server's memory, taking the file system and disk out of the latency chain.]

Latency Measurements (client and server on the same switch)
[Chart: latency (microseconds, 0 to 250) versus block size (100 to 8100 bytes), broken down into server xrootd overhead, server xrootd CPU, client xroot overhead, client xroot CPU, TCP stack/NIC/switching, and minimum transmission time.]
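The stacked components in that chart suggest a simple model: total request latency is a sum of fixed per-request software overheads plus a transmission time that grows linearly with block size. The sketch below uses assumed, illustrative values for the fixed components (they are not read off the chart) and a 1 Gbit/s link for the transmission term.

```python
# Illustrative decomposition of request latency into fixed per-request
# overheads plus size-dependent transmission time, matching the categories
# in the measurement chart. The microsecond values and link speed are
# assumptions for illustration, not measurements from the slide.

FIXED_OVERHEADS_US = {
    "server xrootd overhead":    30.0,
    "server xrootd CPU":         20.0,
    "client xroot overhead":     25.0,
    "client xroot CPU":          15.0,
    "TCP stack, NIC, switching": 40.0,
}

LINK_BITS_PER_SEC = 1e9   # assumed 1 Gbit/s client-server link

def total_latency_us(block_size_bytes):
    """Fixed overheads plus minimum wire transmission time, in microseconds."""
    transmission_us = block_size_bytes * 8 / LINK_BITS_PER_SEC * 1e6
    return sum(FIXED_OVERHEADS_US.values()) + transmission_us

if __name__ == "__main__":
    for size in range(100, 8200, 1000):
        print(f"{size:5d} bytes -> {total_latency_us(size):6.1f} us")
```

The fixed overheads dominate at small block sizes, which is why shaving software and protocol overhead matters more than raw bandwidth for the sub-100 µs latency goal.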

Development Machine Deployment: Likely Low-Risk Next Step
- Clients: up to 2000 nodes, each with 2 CPUs and 2 GB memory, running Linux, behind a Cisco switch
- Data servers: 80 nodes, each with 8 Opteron CPUs and 128 GB memory, up to 10 TB total memory, running Solaris, behind a Cisco switch

Development Machine: Complementary Higher-Risk Approach
Add flash-memory based subsystems:
- Quarter to half the price of DRAM
- Minimal power and heat
- Persistent
- 25 µs chip-level latency (but hundreds of µs latency in consumer devices)
- Block-level access (~1 kbyte)
- Rated life of 10,000 writes for two-bit-per-cell devices (NB: BaBar writes FibreChannel disks < 100 times in their entire service life); a rough endurance estimate follows this list
- Exploring the necessary hardware/firmware/software development with PantaSys Inc.
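To see why a 10,000-write rated life is comfortable for a read-mostly data cache, here is a rough endurance estimate. The workload (one full rewrite per day) and the assumption of even wear are illustrative, not figures from the slide.

```python
# Rough flash endurance estimate for a read-mostly data cache.
# Assumptions (not from the slide): the cache is fully rewritten once per day
# and writes are spread evenly across cells (perfect wear leveling).

rated_write_cycles = 10_000      # two-bit-per-cell rating quoted on the slide
full_rewrites_per_day = 1        # assumed workload

days_to_wear_out = rated_write_cycles / full_rewrites_per_day
print(f"Wear-out after ~{days_to_wear_out:.0f} days "
      f"(~{days_to_wear_out / 365:.0f} years) at "
      f"{full_rewrites_per_day} full rewrite(s) per day")
```

Even rewriting the whole cache daily gives a lifetime of roughly 27 years under these assumptions, so write endurance is not the limiting factor for this access pattern.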

Object-Serving Software
AMS and Xrootd (Andy Hanushevsky/SLAC):
- Optimized for read-only access
- Make 1000s of servers transparent to user code (see the sketch after this list)
- Load balancing
- Automatic staging from tape
- Failure recovery
- Can allow BaBar to start getting benefit from a new data-access architecture within months, without changes to user code
- Minimizes the impact of hundreds of separate address spaces in the data-cache memory
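The transparency and load-balancing idea can be illustrated with a minimal redirector sketch: a client asks a front-end node which data server currently holds (or is the least-loaded holder of) a file, then reads from that server directly. This is a simplified illustration of the pattern only, not the actual xrootd protocol or API; all names in the code are hypothetical.

```python
# Minimal sketch of redirector-style access to a read-only object store:
# the client never needs to know which of many servers holds the data.
# This illustrates the pattern only; it is not the xrootd protocol or API,
# and all class/field names here are hypothetical.

from dataclasses import dataclass, field

@dataclass
class DataServer:
    name: str
    files: dict = field(default_factory=dict)   # path -> bytes held in memory
    load: float = 0.0                           # crude load estimate

@dataclass
class Redirector:
    servers: list

    def locate(self, path: str) -> DataServer:
        """Pick the least-loaded server holding the file; stage it if absent."""
        holders = [s for s in self.servers if path in s.files]
        if not holders:                         # "automatic staging from tape"
            target = min(self.servers, key=lambda s: s.load)
            target.files[path] = stage_from_tape(path)
            holders = [target]
        return min(holders, key=lambda s: s.load)

def stage_from_tape(path: str) -> bytes:
    # Placeholder for a tape/HSM read in a real system.
    return f"contents of {path}".encode()

def read(redirector: Redirector, path: str) -> bytes:
    server = redirector.locate(path)            # one extra round trip
    server.load += 1.0                          # crude load accounting
    return server.files[path]                   # then read directly from memory

if __name__ == "__main__":
    cluster = Redirector([DataServer(f"ds{i:02d}") for i in range(4)])
    print(read(cluster, "/store/babar/event-0001.root"))
```

Because the redirection happens below the file-access layer, user code keeps opening files by name while the system spreads load and recovers from failed servers behind the scenes, which is the property the slide emphasizes.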

Summary: SLAC's goals for leadership in Scientific Computing
[Diagram, repeated from earlier: the stack from application sciences (SLAC + Stanford science) through issues addressable with computing, computing techniques, computing architectures, and computing hardware, grouped under the science of scientific computing, computing for data-intensive science, and collaboration with Stanford and industry.]