Tackling tomorrow's computing challenges today at CERN. Maria Girone, CERN openlab CTO



CERN is the European Laboratory for Particle Physics.

The laboratory straddles the Franco-Swiss border near Geneva.

A worldwide endeavour: 22 member states, 8 associate member states (some in the pre-stage to membership), 3 observers, plus cooperation agreements; budget (2017) of 1100 MCHF. It has 22 member states and supports a global community of 15,000 researchers.

Looking for antimatter. Understanding the very first moments of our Universe after the Big Bang. Understanding dark matter. These researchers are probing the fundamental structure of the Universe.

Advance the frontiers of knowledge, e.g. the secrets of the Big Bang: what was matter like within the first moments of the Universe's existence? Develop new technologies for accelerators and detectors, from information technology (the Web and the Grid) to medicine (diagnosis and therapy). Train the scientists and engineers of tomorrow. Unite people from different countries and cultures. CERN's mission: research, technology, education, and collaboration.


The LHC is the world's largest and most powerful particle accelerator.

The four main experiments, ALICE, ATLAS, CMS and LHCb, are located around the ring.

It is built about 100 m underground and has a circumference of 27 km.

The particles are accelerated to close to the speed of light.

The fastest racetrack on the planet, the most powerful magnets, the most sophisticated detectors ever built, the highest vacuum, some of the hottest spots in the galaxy, and temperatures colder than outer space: the LHC is a machine of records!

What may come next.


The detectors are like gigantic digital cameras built in cathedral-sized caverns.

Experiments are run by collaborations of scientists from institutes all over the world.

ATLAS: 46 m long, 25 m in diameter, weighs 7,000 tonnes; 100 million electronic channels and 3,000 km of cables. CMS: 22 m long, 15 m in diameter, weighs 14,000 tonnes; the most powerful superconducting solenoid ever built. Two general-purpose detectors cross-confirm discoveries, such as the Higgs boson.

ALICE studies the quark-gluon plasma, a state of matter that existed moments after the Big Bang. LHCb studies the behaviour difference between the b quark and the anti-b quark to explain the matter-antimatter asymmetry in the Universe. The ALICE and LHCb experiments have detectors specialised in studying specific phenomena.


Collisions generate particles that decay in complex ways into even more particles.

Up to about 1 billion particle collisions can take place every second.

Data is generated 40 million times per second (PB/s); the trigger reduces this to around 100,000 selections per second (TB/s) and then to around 1,000 selections per second (GB/s). This can generate up to a petabyte of data per second. The data are filtered in real time, selecting potentially interesting events (the trigger).
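As a quick illustration of how aggressive this filtering is, here is a back-of-envelope sketch of the rate reduction implied by the figures above (illustrative arithmetic, not an official specification):

```python
# Rough sketch of the event-rate reduction through the trigger chain,
# using the rates quoted on the slide (illustrative arithmetic only).

collision_rate_hz = 40e6   # data generated 40 million times per second
first_stage_hz    = 100e3  # ~100,000 selections per second
final_stage_hz    = 1e3    # ~1,000 selections per second

print(f"first stage keeps 1 in {collision_rate_hz / first_stage_hz:,.0f} events")
print(f"second stage keeps 1 in {first_stage_hz / final_stage_hz:,.0f} of those")
print(f"overall reduction: 1 in {collision_rate_hz / final_stage_hz:,.0f}")

# -> first stage keeps 1 in 400 events
# -> second stage keeps 1 in 100 of those
# -> overall reduction: 1 in 40,000
```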


The CERN data centre processes hundreds of petabytes of data every year.

MEYRIN CENTRE (CH): 300,000 processor cores, 180 PB on disk, 230 PB on tape. WIGNER CENTRE (H): 100,000 processor cores, 100 PB on disk. The two centres are connected by three 100 Gb/s fibre-optic links. CERN's data centre in Meyrin is the heart of the laboratory's computing infrastructure.

The Wigner data centre in Budapest serves as an extension to the one in Meyrin.


Physicists must sift through the 30-50 PB produced annually by the LHC experiments.

Online, in real time, collisions at ~40 MHz (~PB/s) pass through the hardware L1 trigger (~100 kHz) and the software high-level trigger (~1 kHz); the raw data then flow to the WLCG at ~1-10 GB/s for offline, asynchronous processing.

Tier-0 (CERN and Hungary): data recording, reconstruction, and distribution. Tier-1: permanent storage, re-processing, analysis. Tier-2: simulation, end-user analysis. The Worldwide LHC Computing Grid integrates computer centres worldwide to combine computing and storage resources into a single infrastructure accessible by all LHC physicists. The WLCG gives thousands of physicists across the globe near real-time access.

With 170 computing centres in 42 countries, the WLCG is the grid that never sleeps!

The size of the WLCG: ~170 sites in 42 countries, ~1M CPU cores delivered, ~1 EB of storage, more than 2 million jobs per day, ~3 PB of data moved per day, 10-100 Gb/s site links, and 340 Gb/s of transatlantic bandwidth.
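To put the quoted 3 PB/day in perspective against the 10-100 Gb/s links, here is the back-of-envelope conversion to a sustained rate (illustrative arithmetic only):

```python
# What "~3 PB moved per day" means as a sustained rate (illustrative arithmetic).

petabyte = 1e15                # bytes (decimal convention)
moved_per_day = 3 * petabyte   # ~3 PB/day quoted on the slide
seconds_per_day = 24 * 3600

bytes_per_second = moved_per_day / seconds_per_day
print(f"average throughput: {bytes_per_second / 1e9:.1f} GB/s "
      f"(~{bytes_per_second * 8 / 1e9:.0f} Gb/s sustained)")

# -> average throughput: 34.7 GB/s (~278 Gb/s sustained)
```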

Making hundreds of petabytes of data accessible globally to scientists is one of the biggest challenges of the WLCG. Data organization, management and access in the WLCG: compute and storage at the WLCG sites are complemented by commercial cloud and HPC resources with storage and compute caches, connected by 1 to 10 Tb/s links.


The LHC has been designed to follow a carefully set out programme of upgrades.

Run 3: ALICE and LHCb upgrades. Run 4: ATLAS and CMS upgrades. The planned upgrades will greatly increase the scientific reach.

The rate of new physics is about 1 event in 10^12. Selecting a new-physics event is like choosing one grain of sand from 20 volleyball courts. More collisions help physicists to observe rare processes and study them with greater precision.
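To make the rarity concrete, the following sketch combines the 1-in-10^12 figure with the collision rate quoted earlier (about a billion per second); it is illustrative arithmetic only:

```python
# How rare "1 event in 10^12" is at LHC collision rates (illustrative arithmetic).

collisions_per_second = 1e9      # up to about 1 billion collisions per second
new_physics_fraction  = 1e-12    # ~1 interesting event in 10^12

expected_per_second = collisions_per_second * new_physics_fraction
seconds_per_event   = 1 / expected_per_second

print(f"expected interesting events per second: {expected_per_second:.0e}")
print(f"i.e. roughly one every {seconds_per_event:,.0f} s "
      f"(~{seconds_per_event / 60:.0f} minutes) of collisions")

# -> expected interesting events per second: 1e-03
# -> i.e. roughly one every 1,000 s (~17 minutes) of collisions
```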

CMS: an event from 2017 with 78 reconstructed vertices. ATLAS: a simulation for the HL-LHC with 200 vertices. The HL-LHC will come online around 2026. More collisions and more complex data.

LHCb and ALICE will move offline processing closer to the online data-collection chain, performing processing and data analysis in near real time. Solutions under investigation include new HLT farms for Run 3 and a flexible, efficient system with an ambitious PUE ratio. The ALICE and LHCb experiments will increase their data acceptance rates for Run 3.

By Run 4, the detectors will become more granular and more radiation hard. Reconstructing more particles with more granular detectors will be computationally more expensive. The ATLAS and CMS experiments will be significantly upgraded for the HL-LHC.

[Plots: ATLAS Preliminary projections of CPU resources (kHS06 x 1000) and disk storage (PB) versus year, 2018-2028, across Run 2, Run 3 and Run 4, comparing the resource needs of the 2017 computing model with a flat budget model (+20%/year for CPU, +15%/year for disk); the gap grows to roughly a factor 4-8.]

Using current techniques, the required computing capacity increases 50-100 times.

Data storage needs are expected to be of the order of exabytes by this time.

It is vital to explore new technologies and methodologies.
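A minimal sketch of the compound-growth arithmetic behind these curves, assuming the slide's flat budget growth rates and a roughly ten-year horizon (the 50-100x needs figure comes from the slide above; everything else is illustrative):

```python
# Sketch of the "resource gap" arithmetic: capacity bought with a flat budget
# grows only through technology improvements (+20%/year CPU, +15%/year disk
# assumed on the slide), while needs grow 50-100x. Illustrative only.

years = 10  # roughly 2018 -> 2028, up to the HL-LHC era

cpu_flat_budget_growth  = 1.20 ** years   # ~6.2x
disk_flat_budget_growth = 1.15 ** years   # ~4.0x

for needed in (50, 100):
    print(f"needs x{needed}: CPU gap ~x{needed / cpu_flat_budget_growth:.0f}, "
          f"disk gap ~x{needed / disk_flat_budget_growth:.0f}")

# -> needs x50: CPU gap ~x8, disk gap ~x12
# -> needs x100: CPU gap ~x16, disk gap ~x25
```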


Technology evolution: improvements in hardware performance and capacity. Software innovation, new architectures, techniques and methods: innovation and revolutionary thinking. Closing the resource gap in the next decade requires close collaboration with industry.

Joint R&D, innovation and knowledge transfer, communication, education, and management: CERN openlab is a unique science-industry partnership, fostering research and innovation.

Three main areas of research and development for the computing challenges: scaling out capacity with public clouds, HPC and new architectures; increasing data centre performance with hardware accelerators (FPGAs, GPUs, ...) and optimized software; and new techniques with machine learning, deep learning and advanced data analytics.


Data centre technologies and infrastructures.

Faced with a resource gap of this magnitude, we must: 1. fully exploit the available hardware; 2. expand dynamically to new computing environments.

CERN is one of the early adopters of, and largest contributors to, OpenStack. 90% of the resources are provided through a private cloud of 320k cores, which allows for flexible and dynamic deployment; CERN is moving to containers for even more flexibility, with current investigations within CERN openlab. End users (volunteer computing, HTCondor, experiment pilot factories, public cloud, batch (LSF), bare metal and HPC, CI/CD) reach IT and experiment services through APIs, CLIs and GUIs, running on containers and VMs provisioned by OpenStack across more than one physical data centre. Layered, virtualized services provide flexibility and efficiency.
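For flavour, here is a minimal, hypothetical sketch of what programmatic resource provisioning on an OpenStack cloud looks like with the openstacksdk Python client; the cloud, image, flavor and network names are placeholders, and this is not CERN's actual tooling:

```python
# Minimal sketch of programmatic VM provisioning on an OpenStack cloud,
# using the openstacksdk client. All names (cloud, image, flavor, network)
# are hypothetical placeholders, not CERN's actual configuration.
import openstack

# Credentials/region come from a clouds.yaml entry named "mycloud" (assumed).
conn = openstack.connect(cloud="mycloud")

image = conn.compute.find_image("CC7 - x86_64")      # assumed image name
flavor = conn.compute.find_flavor("m2.medium")       # assumed flavor name
network = conn.network.find_network("private-net")   # assumed network name

server = conn.compute.create_server(
    name="batch-worker-001",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
# Block until the VM is ACTIVE, then print its ID and status.
server = conn.compute.wait_for_server(server)
print("provisioned:", server.id, server.status)
```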

Experiments have demonstrated that it is possible to elastically and dynamically expand production resources to commercial clouds, at scales of roughly 80k to 300k cores. Large-scale tests with commercial clouds.

Joint procurement of R&D cloud services for scientific research.

HPC centres are significant resources, optimized for highly parallel applications, and are being tested by the experiments. ATLAS reached more than 200k traditional x86 HPC cores for simulation workflows (cores by processing type and by resource type peaking around 500k, dominated by MC simulation, alongside MC reconstruction, grid production, data derivation, HLT and cloud resources), with smooth Tier-0 running on 23k cores. All experiments are exploring the use of heterogeneous HPC architectures, and CERN will partner with EU-PRACE on optimizing the use of HPC resources. Demonstrations with large-scale, dedicated HPC resources, too.

DEEP-EST: Dynamical Exascale Entry Platform - Extreme Scale Technologies CERN is a partner of DEEP-EST, a blueprint project for heterogeneous HPC systems.


Computing performance and software: exploiting heterogeneous resources.

HEP has a vast investment in software, and significant effort goes into making efficient multi-threaded and vectorized CPU code. Accelerated computing devices (GPUs, FPGAs) offer a different model, with the added complexity of heterogeneous architectures. Lower-performance but lower-power alternatives such as ARM are being explored simultaneously (C. Leggett, LBNL). Software optimization can gain factors in performance.
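As a toy illustration of what "vectorized" buys (not HEP production code), the sketch below computes track transverse momenta with a plain Python loop and with a vectorized NumPy expression:

```python
# Toy illustration of vectorization: computing transverse momentum
# pT = sqrt(px^2 + py^2) for many tracks, first with a Python loop,
# then with a vectorized NumPy expression. Illustrative only; real HEP
# frameworks do this in optimized C++ with SIMD and threads.
import math
import time
import numpy as np

rng = np.random.default_rng(42)
px = rng.normal(size=2_000_000)
py = rng.normal(size=2_000_000)

t0 = time.perf_counter()
pt_loop = [math.sqrt(x * x + y * y) for x, y in zip(px, py)]   # scalar loop
t1 = time.perf_counter()
pt_vec = np.sqrt(px * px + py * py)                            # vectorized
t2 = time.perf_counter()

assert np.allclose(pt_loop, pt_vec)
print(f"loop: {t1 - t0:.2f} s, vectorized: {t2 - t1:.3f} s, "
      f"speed-up ~x{(t1 - t0) / (t2 - t1):.0f}")
```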

The landscape is shifting at all levels (2008 versus 2018).

Higher data rates require more selective triggering and faster reconstruction. LHCb is investigating FPGAs and GPUs to allow reconstruction of 5 GB/s of events in real time; its dataflow goes from detector readout at 5 TB/s, through HLT1 partial reconstruction at 0.1-0.2 TB/s, to HLT2 full reconstruction at ~5 GB/s, split into 5% full, 85% turbo and real-time analysis, and 10% calibration. CMS is porting heavy offline tasks to real-time processing for the HL-LHC, integrating GPUs in the HLT farm to give high-quality reconstruction within 100 ms latency (as opposed to tens of seconds). Exploiting co-processors for software-based filtering and real-time reconstruction.

CERN openlab is engaging in quantum computing with industry. Quantum computing could substantially speed up the training of deep-learning models and combinatorial searches, is well suited for fitting, minimization and optimization, and can directly describe basic interactions as well as lattice QCD calculations. Quantum computing is also on the horizon.


Machine learning and advanced data analytics.

Experiment and accelerator operations face challenges similar to industrial applications, with a multitude of industrial control systems: cooling and ventilation, vacuum, cryogenics, gas, the electric grid, the LHC circuits, QPS, WIC, PIC. The health of the detector and accelerator infrastructure needs to be monitored, the quality of the produced data needs to be validated, and resource usage needs to be optimized. We are working with industry partners to deploy similar techniques and automation. Monitoring, automation and anomaly detection.
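A minimal sketch of the kind of anomaly detection that can run on a control-system sensor stream; the rolling z-score rule, window and threshold below are illustrative assumptions, not the techniques actually deployed:

```python
# Minimal sketch of anomaly detection on a control-system sensor stream:
# flag samples that deviate from a rolling mean by more than k standard
# deviations. The signal, window and threshold are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
signal = rng.normal(loc=20.0, scale=0.1, size=1000)   # e.g. a temperature in deg C
signal[700:705] += 2.0                                 # inject a fault-like spike

def rolling_zscore_anomalies(x, window=50, threshold=5.0):
    """Return indices where |x[i] - rolling_mean| > threshold * rolling_std."""
    anomalies = []
    for i in range(window, len(x)):
        ref = x[i - window:i]
        mu, sigma = ref.mean(), ref.std()
        if sigma > 0 and abs(x[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies

print("anomalous samples:", rolling_zscore_anomalies(signal))
# -> flags samples around index 700, where the injected spike begins
```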

With current software and computers, reconstructing an HL-LHC-like event takes tens of seconds. The idea is to treat the detector hit information as input images and use 3D image-recognition techniques for identification, recognizing physics objects from learned patterns. This might dramatically increase the speed of reconstruction. Exploring image recognition for reconstruction and object identification.
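A minimal sketch of the "hits as images" idea: a small convolutional network classifying 2D hit maps into a few object categories. The input shape, layers and class labels are placeholders, not an actual experiment model:

```python
# Minimal sketch of treating detector hits as images: a small convolutional
# network classifying 2D hit maps into a few physics-object categories.
# Input shape, layers and the number of classes are illustrative placeholders.
import numpy as np
import tensorflow as tf

NUM_CLASSES = 4            # e.g. electron / muon / jet / noise (assumed labels)
INPUT_SHAPE = (32, 32, 1)  # assumed 32x32 single-channel hit map

model = tf.keras.Sequential([
    tf.keras.Input(shape=INPUT_SHAPE),
    tf.keras.layers.Conv2D(16, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in data, just to show the training call; real inputs would be
# hit maps built from detector readout, with labels from simulation.
x = np.random.rand(256, *INPUT_SHAPE).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=256)
model.fit(x, y, epochs=1, batch_size=32, verbose=0)
print(model.predict(x[:1], verbose=0).shape)   # -> (1, 4) class probabilities
```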

Main R&D areas: adapting the existing code to new computing architectures; replacing complex algorithms with deep-learning approaches (fast simulation); and looking at adversarial networks to improve speed without giving up the accuracy of the simulated events. One network attempts to simulate events that match a data distribution, while a second network tries to distinguish data from simulation. Simulation is one of the most resource-intensive computing applications.
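A minimal sketch of that adversarial setup: a generator turns noise into toy "events" while a discriminator tries to separate them from reference samples. The dimensions, architectures and stand-in data are all illustrative assumptions, not an experiment's fast-simulation model:

```python
# Minimal sketch of a generative adversarial setup for fast simulation:
# a generator maps noise to fake "events", a discriminator tries to separate
# them from reference events. Everything here is an illustrative toy.
import numpy as np
import tensorflow as tf

NOISE_DIM, EVENT_DIM = 8, 4   # assumed toy dimensions

generator = tf.keras.Sequential([
    tf.keras.Input(shape=(NOISE_DIM,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(EVENT_DIM),
])
discriminator = tf.keras.Sequential([
    tf.keras.Input(shape=(EVENT_DIM,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

g_opt = tf.keras.optimizers.Adam(1e-3)
d_opt = tf.keras.optimizers.Adam(1e-3)
bce = tf.keras.losses.BinaryCrossentropy()

# Stand-in "reference events": Gaussian vectors playing the role of full simulation.
real_events = np.random.normal(size=(1024, EVENT_DIM)).astype("float32")

for step in range(200):
    noise = tf.random.normal((64, NOISE_DIM))
    real = real_events[np.random.choice(len(real_events), 64)]
    with tf.GradientTape() as d_tape, tf.GradientTape() as g_tape:
        fake = generator(noise, training=True)
        d_real = discriminator(real, training=True)
        d_fake = discriminator(fake, training=True)
        # Discriminator: label reference events 1, generated events 0.
        d_loss = bce(tf.ones_like(d_real), d_real) + bce(tf.zeros_like(d_fake), d_fake)
        # Generator: try to make the discriminator call its events "real".
        g_loss = bce(tf.ones_like(d_fake), d_fake)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))

print("generated sample:", generator(tf.random.normal((1, NOISE_DIM))).numpy())
```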


CERN is collaborating with other communities who share similar computing challenges.

The Square Kilometre Array (SKA) observatory's two telescopes, in South Africa and Western Australia, will enable astronomers to study the sky in unprecedented detail. The first phase will be operational in the mid-2020s; the observatory will function for 50 years. A joint exascale data-storage and processing challenge between the HL-LHC and SKA.

CERN-MEDICIS: production of innovative isotopes for medical research. Accelerator design for future hadron therapy facilities. Medical imaging. Dosimetry. Computing and simulation for health applications (BioDynaMo). Accelerating innovation and knowledge transfer to medical applications.


CERN has been pushing the boundaries of knowledge and technology for more than 60 years. The next phase of the programme will include unprecedented computing challenges. We look forward to tackling these challenges through open collaboration and innovation with industry and other scientific communities.

«Magic is not happening at CERN, magic is being explained at CERN.» Tom Hanks. Thank you! Credit for the slide layout to Andrew Purcell, CERN.

CERN: home.cern, home.cern/studentseducators, @CERN. CERN openlab: openlab.cern, @CERNopenlab, openlab.cern/whitepaper, openlab.cern/education.

Backup Slides

The introduction of NVRAM provides much faster access to data and applications in memory. High-performance, long-endurance SSDs improve access to large datasets. Low-cost, high-capacity SSDs could revolutionize long-term storage. Larger spinning disks bring us to exabytes of storage. Storage has many more layers and improvements in access.

Raw data: was a detector element hit? How much energy? At what time? Reconstructed data: momentum of tracks (4-vectors), origin of the collision, energy in clusters (jets), type of particle, calibration information. 150 million sensors deliver data 40 million times per second. The data at the LHC.
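For a sense of why the front-end volume is quoted in petabytes per second, here is the order-of-magnitude arithmetic; the per-sensor payload is an assumption, not a detector specification:

```python
# Why the front-end data volume is quoted in PB/s (illustrative arithmetic only):
# the bytes-per-sensor-per-reading figure below is an assumption, not a spec.

sensors = 150e6             # 150 million sensors
readings_per_second = 40e6  # delivered 40 million times per second
bytes_per_reading = 1       # assumed order-of-magnitude payload per sensor

raw_rate = sensors * readings_per_second * bytes_per_reading
print(f"raw front-end data rate ~ {raw_rate / 1e15:.0f} PB/s before any filtering")
# -> raw front-end data rate ~ 6 PB/s before any filtering
```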