Accelerating Throughput from the LHC to the World

Accelerating Throughput from the LHC to the World. David Groep, Nikhef PDP Advanced Computing for Research, v5, Ignatius 2017.

12.5 MByte/event, 120 TByte/s... and now what?

Chance of a Higgs particle: 1 in 1,000,000,000,000 collisions. That is equivalent to looking for 1 person among 1,000 world populations, or for one needle in 20 million haystacks. (8 September 2015, Oriëntatie Nikhef)

50 PiB/year primary data

Detector to doctor... 40 million collisions per second; analysis of the collisions by PhD students, and processing; data distribution with GRID computers; the trigger system selects 600 Hz, ~1 GB/s of data. Collisions are characterised by their event signature: electrons, muons, jets, ... (8 September 2015, Oriëntatie Nikhef)

Building the infrastructure in a federated way. September 2016: 63 MoUs; 167 sites in 42 countries; CPU: 3.8 M HEPSpec06, roughly 350,000 of today's fastest cores (in practice many more, since cores up to 5 years old are in use); disk: 310 PB; tape: 390 PB. (8 October 2016, Ian Bird)

Federated infrastructure: ~300 resource centres, ~250 communities.

Global collaboration in a secure way. Collaboration is about people as well as (or even more than) systems. A global identity federation for e-infrastructures and cyber research infrastructures needs common baseline assurance (trust) requirements and persistent, globally unique identities with a global scope, so we built the Interoperable Global Trust Federation: over 80 member authorities, including your GÉANT Trusted Certificate Service.

Building the infrastructure for the LHC data: from hierarchical data distribution to a full mesh and dynamic data placement. (LHCOne graphic: lhcone.net; Amsterdam/NIKHEF-SARA)

Connecting Science through Lambdas

Network built around application data flow. We need to work together: without our SURFsara peering, SURFnet gets flooded, and you really want many of your own peerings.

Statistics: the Dutch National e-Infrastructure, coordinated by BiG Grid, provides HTC and storage platform services at 3 core operational sites (SURFsara, Nikhef, RUG-CIT), with 25+ PiB tape, 10+ PiB disk, and 12,000+ CPU cores. At Nikhef: ~5,500 cores and 3.5 PiB, a focus on large/many-core systems, more than 45 install flavours (service types), and a bunch of one-off systems.

Shared infrastructure, efficient infrastructure: >98% utilisation, >90% efficiency. [Figure. Right: the NIKHEF-ELPROD facility, Friday 9 December 2016. Left: annual usage distribution 2013-2014 across WeNMR (NL), EGI biomedical, LIGO/Virgo gravitational waves, LHC ALICE, LHC ATLAS, and LHCb.]

Waiting will not help you any more. (Helge Meinhard, Bernd Panzer-Steindel, Technology Evolution, https://indico.cern.ch/event/555063/contributions/2285842/)

For (informed) fun & testing, some random one-off systems: plofkip.nikhef.nl.

For (informed) fun & testing, some random one-off systems: CO2-cooled Intel CPUs at 6.2 GHz.

From SC04, CCRC08, STEP09, ... to today: global transfer rates have increased to >40 GB/s; acquisition: 10 PB/month (~x2 for physics data). (STEP09: Jamie Shiers, CERN IT/GS; throughput 2016: WLCG Workshop 2016, Ian Bird, CERN)

And tomorrow?! [Charts: data estimates for the 1st year of HL-LHC (PB) and CPU needs for the 1st year of HL-LHC (kHS06), broken down by experiment: ALICE, ATLAS, CMS, LHCb.] Data: raw grows from 50 PB (2016) to 600 PB (2027); derived (1 copy) grows from 80 PB (2016) to 900 PB (2027). CPU: x60 from 2016, while technology improving at ~20%/year will bring only x6-10 in 10-11 years.
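The x6-10 figure is just compound annual improvement. A minimal back-of-envelope check, where the 15-25% range is an assumption bracketing the ~20%/year quoted above (at ~20%/year the gain is roughly x6-7 over 10-11 years, and only at ~25%/year does it approach x10):

```python
# Back-of-envelope check of "technology at ~20%/year brings x6-10 in 10-11 years".
# The 15% and 25% rates are assumptions bracketing the ~20%/year figure from the slide.
for rate in (0.15, 0.20, 0.25):
    for years in (10, 11):
        growth = (1 + rate) ** years      # compound improvement factor
        print(f"{rate:.0%}/year over {years} years -> x{growth:.1f}")
```

Either way, the gap to the required x60 in CPU is roughly an order of magnitude, which is the point of the slide.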

Interconnecting compute & storage: data shall not be a bottleneck. The 5,500 cores together process ~16 GByte/s of data sustained, or ~10 GByte per job slot per hour; loads are bursty when many tasks start together, and in parallel we have to serve the world.
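As a rough consistency check of those figures, using only the numbers quoted above (a sketch, not an exact accounting of the facility):

```python
# How ~10 GByte per job slot per hour aggregates over ~5,500 job slots.
job_slots = 5500                       # cores / job slots at Nikhef (from the slide)
per_slot_gb_per_hour = 10              # ~10 GByte per job slot per hour

aggregate_gb_per_s = job_slots * per_slot_gb_per_hour / 3600
per_slot_mb_per_s = per_slot_gb_per_hour * 1000 / 3600

print(f"aggregate: ~{aggregate_gb_per_s:.0f} GByte/s sustained")   # ~15 GByte/s
print(f"per slot:  ~{per_slot_mb_per_s:.1f} MByte/s")              # ~2.8 MByte/s
```

The ~15 GByte/s result matches the ~16 GByte/s sustained rate quoted on the slide.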

Infrastructure for research: balancing network, CPU, and disk. CPU and disk are both expensive, yet idling CPUs are even costlier. Architecture and performance matching averts any single bottleneck, but requires knowledge of application (data-flow) behaviour: data pre-placement (local access) and mesh data federation (WAN access). This is why, for example, your USB drive does not cut it, and neither does your home NAS box, however much I like my home system, which uses just 15 Watt idle and offers 16 TB for just 915.
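A rough illustration of why consumer storage does not cut it: the ~120 MByte/s and ~250 MByte/s drive throughputs below are assumptions for a typical USB disk and home NAS, while the farm numbers come from the slides above.

```python
# Compare assumed consumer-drive throughput with the farm's aggregate demand.
farm_demand_mb_per_s = 16_000          # ~16 GByte/s sustained across the farm (slide)
per_slot_mb_per_s = 2.8                # ~10 GByte per job slot per hour (slide)

for name, drive_mb_per_s in [("USB disk (~120 MB/s, assumed)", 120),
                             ("home NAS (~250 MB/s, assumed)", 250)]:
    slots_fed = drive_mb_per_s / per_slot_mb_per_s
    drives_needed = farm_demand_mb_per_s / drive_mb_per_s
    print(f"{name}: feeds ~{slots_fed:.0f} job slots; "
          f"~{drives_needed:.0f} of them would be needed for the whole farm")
```

A single such device feeds only a few dozen job slots, before even considering burstiness or serving data to the rest of the world.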

Getting more bytes through? POWER8: more PCIe lanes and a higher clock should give more throughput, if all the bits fit together; the only way to find out is by trying it, in a joint experiment of Nikhef and SURFsara comparing I/O throughput between x86 and POWER8, with HGST enclosures of 480 TByte gross capacity per 4 rack units. Yet more is needed: RAID cards are now a performance bottleneck, JBOD changes the CPU-to-disk ratio, and closer integration of the networking is needed to get beyond 100 Gbps.
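A minimal sketch of the kind of sequential-read probe such an x86-versus-POWER8 comparison might start from; the file path is hypothetical, the file should be much larger than RAM (or the page cache dropped first) for a meaningful number, and a dedicated tool such as fio is the better choice for serious benchmarking.

```python
# Time sequential reads of a large file in big chunks and report MByte/s.
import time

PATH = "/data/testfile"       # hypothetical test file on the storage under test
CHUNK = 8 * 1024 * 1024       # 8 MiB reads, large enough to approach streaming speed

total = 0
start = time.monotonic()
with open(PATH, "rb", buffering=0) as f:   # unbuffered binary reads
    while True:
        buf = f.read(CHUNK)
        if not buf:
            break
        total += len(buf)
elapsed = time.monotonic() - start

print(f"read {total / 1e9:.1f} GB in {elapsed:.1f} s "
      f"-> {total / 1e6 / elapsed:.0f} MByte/s")
```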

Fun, but not the solution to single-core performance: a collaboration of Intel with Nikhef PDP & MT (Krista de Roo). CO2 Inside.