Data Issues for next generation HPC

1 Data Issues for next generation HPC. Bryan Lawrence, National Centre for Atmospheric Science, National Centre for Earth Observation, Rutherford Appleton Laboratory. Caveats: Due to time, discussion is limited to expected use of HPC for climate simulation, but I expect the arguments apply to pretty much the entire gamut of direct numerical simulation of the environment using HPC. Outline: Context & Futures

2 Context

3 FAR:1990 SAR:1995 TAR:2001 AR4:2007 AR5:2013

4 Using Compute Cycles They all end up inputting and producing more data!

5 Simulation Data Deluge 8 PB 4 PB 2 PB BADC Figure courtesy of Gary Strand (NCAR)

6 and let's not leave out EO! Source: ESA GSCB Workshop June 2009

7 and let's not leave out EO! Source: ESA GSCB Workshop June 2009 Crucial component of model evaluation!

8 State of the art: CMIP5: Fifth Coupled Model Intercomparison Project
Centennial & Longer. Projections/simulations: e.g. Pre-industrial control, historical, RCPs, Paleo. Perturbation examples: Natural forcings only, GHG forcings only, abrupt 4xCO2 runs. Longer integrations.
Decadal. Projections/simulations: 10 and 30 year hindcasts and predictions. Perturbation examples: Hindcasts without volcanoes.
Atmosphere only. Projections/simulations: AMIP. Perturbation examples: aqua-planet.
~90,000 years simulated in ~60 experiments. ~20-30 modelling centres (from around the world) using ~60? major model configurations. ~1-2 million output atomic datasets. ~10s of petabytes of output, 1 PB at BADC!
Why 1 PB at BADC? Example: Just the ocean 3d fields for the decadal experiments = 45 TB. We can't have folks downloading that across the Atlantic.

9 The pieces of CMIP5 support HARDWARE SOFTWARE COLLABORATION METADATA DEVELOPMENTS USAGE TOOLKITS Data storage International effort Approx. 1,000TB International effort Sub-setting Replication system Faster network Batch processing QC & Versioning systems Describing models, experiments and datasets Servers to deliver and process data MOHC DATA SUPPORT Data handling of MOHC models Checking and QC Connection to tools Interfaces to data Standard format and description for all NERC DATA SUPPORT INTERNATIONAL DATA SUPPORT Harmonisation Data handling of HIGEM and Paleo models Data handling of models Format conversion Connection to tools Checking and QC Checking and QC Connection to tools Re-gridding Format conversion Visualisation Analysis Platform UK Community Engagement with Impacts Community Public Sector, general public and Private Sector access Development of Derived Products (From a BADC perspective) Not just about storage!

10 Futures

11 A range of scales Thus far it has taken roughly five years for every doubling of climate model resolution, so this would be a pretty aggressive speed of improvement, but recent emphasis on complexity is likely to be replaced by an emphasis on resolution for a while...

12 Computing Futures [Figure labels: Best available UK climate sys; Bottom climate sys (~ tier 2); capacity; capability; Top 500 #1; Less traditional capacity]

13 The Capacity Data Future [Table, most values lost in extraction. Columns: four CMIP generations (CMIP, CMIP A, CMIP B, CMIP) at Atmos Resolution 1 degree, 0.5 degree, 25 km, 10km. Rows: Atmos Gridpoints; Atmos # Gridpoints; Atmos # Vert Levels; Atmos % Extra Variables; Storage (GB/model-mon); Storage Ratio; Comp Ratio to 2012 (data alone); Comp Ratio to 2012 (inc CFL timestep); Archive Size (1 PB in the first column, 500 x? PB later).] Still a HARD HPC problem. From a data point of view: increased resolution may be less of a problem than more science & bigger ensembles plus wide availability of petascale computing. (? = likely to be significantly greater.) Exascale Data
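The row labels on this slide (gridpoints, storage per model-month, compute ratio with and without the CFL timestep) suggest a simple scaling argument. The sketch below is one reading of that argument, not the slide's own calculation: it assumes a fixed number of vertical levels and output variables, that storage scales with horizontal gridpoints, and that the CFL condition shrinks the timestep in proportion to grid spacing. The function name and the refinement factors are illustrative.

```python
def scaling_for_refinement(refinement: float) -> dict:
    """Rough cost ratios when horizontal grid spacing is divided by `refinement`.

    Assumptions (not from the slide): vertical levels and the list of
    output variables stay fixed; output frequency is unchanged.
    """
    gridpoints = refinement ** 2   # two horizontal dimensions are refined
    timesteps = refinement         # CFL: timestep shrinks with grid spacing
    return {
        "gridpoint_ratio": gridpoints,
        "storage_ratio": gridpoints,                # same fields on more points
        "compute_ratio_data_alone": gridpoints,     # work per step only
        "compute_ratio_with_cfl": gridpoints * timesteps,
    }


if __name__ == "__main__":
    # Illustrative refinement factors, e.g. 1 degree -> 0.5 degree is a factor of 2.
    for label, factor in [("2x finer", 2.0), ("4x finer", 4.0)]:
        print(label, scaling_for_refinement(factor))
```

Under these assumptions compute grows as the cube of the refinement factor while stored data grows as the square, which fits the slide's remark that resolution alone may be less of a data problem than more science and bigger ensembles.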

14 The Capability Data Future Not planning for this as a data sharing activity yet! John Shalf, LBNL: May Accessed 24/01/2011

15 We can produce it, and we can store it, but can we move it? Moore's Law: CPU power doubles every 18 months. Gilder's Law: the total communication bandwidth triples every twelve months (doubling time about 8 months). Experience of the RAL Wide Area Network (WAN) is consistent with Gilder's Law, with a doubling time of a little over 7 months over the last 15 years. Sounds good, but Nielsen's Law: internet bandwidth doubles every months. High performance network cards have moved from 1 Gbit/s to 10 Gbit/s in a decade: Local Area Network (LAN) access performance doubling every 21 months. In 1998, the BADC moved to Token Ring (100 Mb/s) to cope with ECMWF data rates. In 2010, BADC is seriously trying to exploit 10 Gbit/s cards to cope with CMIP5 data replication and delivery, but we are fighting with the WAN: headline speed is not easily achievable. Internationally (and for the Met Office) we still exploit sneakernet (sending personal RAID systems and USB disks via courier). Very staff-time intensive; we would prefer to use the WAN! In 2022 we can expect WAN performance to cope with massive distribution, but the LAN will limit massive I/O at any particular location unless we take action!
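The doubling times quoted on this slide can be checked with a short calculation. The sketch below is illustrative only; the function names are hypothetical. It converts "grows by a factor f over p months" into a doubling time, then applies it to Gilder's tripling per year and to the BADC figures of 100 Mb/s in 1998 and 10 Gbit/s in 2010.

```python
import math


def doubling_time_months(growth_factor: float, period_months: float) -> float:
    """Doubling time implied by growing `growth_factor`-fold in `period_months`."""
    return period_months * math.log(2) / math.log(growth_factor)


def growth_over(months: float, doubling_months: float) -> float:
    """Growth factor after `months` at a fixed doubling time."""
    return 2.0 ** (months / doubling_months)


# Gilder's Law as stated on the slide: bandwidth triples every twelve months.
gilder = doubling_time_months(3.0, 12.0)

# BADC LAN figures from the slide: 100 Mb/s (1998) to 10 Gbit/s (2010).
badc_lan = doubling_time_months(10e9 / 100e6, (2010 - 1998) * 12)

if __name__ == "__main__":
    print(f"Gilder doubling time: {gilder:.1f} months")
    print(f"BADC LAN doubling time: {badc_lan:.1f} months")
    # How far apart WAN (about 7 months) and LAN drift over the 12 years to 2022:
    print(f"WAN growth in 12 years: {growth_over(144, 7.0):.3g}x")
    print(f"LAN growth in 12 years: {growth_over(144, badc_lan):.3g}x")
```

The first figure comes out a little under 8 months and the second a little under 22 months, in line with the "about 8 months" and "21 months" quoted above. The gap between the two growth factors is the reason the slide expects the LAN, not the WAN, to become the limit.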

16 Practicalities of Data Volumes [Table, values largely lost in extraction: download times (hours) for 10 years of (all) data from one ensemble member, by Atmos Resolution (columns CMIP, CMIP6 2017A, CMIP6 2017B, CMIP at 1 degree, 0.5 degree, 25 km, 10km) and by link speed (1 Gbit/s and two further Gbit/s rates).] Clearly users won't be downloading ALL data from ensembles, now or in the future! ExArch data comparison: must avoid N x N data movements between N data production sites! Prefer N movements to M (<N) processing sites, and exploit better algorithms! Data movement might be practical MxN times, but only with lightpaths in place (lightpath = dedicated network slice over dark fibre, which provides guaranteed bandwidth, unlike a shared network). Want dynamic lightpaths: drop when not in use.
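As a rough illustration of the two points on this slide, the sketch below estimates a download time from a volume and a link speed, and counts transfers for all-to-all exchange versus shipping to a few processing sites. It is a minimal sketch under stated assumptions: decimal units (1 TB = 10^12 bytes), a link running at its full headline rate unless an efficiency factor is given, and hypothetical site counts. The 45 TB input is the decadal ocean 3d example from slide 8, not a value from the lost table.

```python
def download_hours(volume_bytes: float, link_bits_per_s: float,
                   efficiency: float = 1.0) -> float:
    """Hours to move `volume_bytes` over a link at `efficiency` of headline speed."""
    return volume_bytes * 8 / (link_bits_per_s * efficiency) / 3600


def all_to_all_transfers(n_sites: int) -> int:
    """Every production site sends its data to every other production site."""
    return n_sites * (n_sites - 1)


def to_processing_sites_transfers(n_sites: int, m_sites: int) -> int:
    """Every production site sends its data to each of M processing sites."""
    return n_sites * m_sites


if __name__ == "__main__":
    volume = 45e12  # 45 TB, the decadal ocean 3d example from slide 8
    for gbit in (1, 10):
        hours = download_hours(volume, gbit * 1e9)
        print(f"{gbit} Gbit/s: {hours:.0f} hours")
    # Hypothetical federation: 20 production sites, 3 processing sites.
    print(all_to_all_transfers(20), "vs", to_processing_sites_transfers(20, 3))
```

Real transfers rarely reach headline speed, so passing an `efficiency` below 1 gives a more realistic figure.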

17 Modelling Post-Processing Infrastructure: UK interim Plan? Local Analysis Cloud (Users control the software environment but can mount high performance disk WITH DATA ALREADY THERE). Why light paths? For CMIP5, synchronising a 1 PB archive at the 1 % level implies 10 TB/day of movement, which implies a 1 Gbit/s requirement. International light paths will cost money!
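The synchronisation arithmetic on this slide can be written out explicitly. The sketch below assumes that "1 % level" means 1 % of the archive changing per day, that units are decimal (1 PB = 10^15 bytes), and that the link is used continuously; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400


def required_gbit_per_s(archive_bytes: float, daily_change_fraction: float) -> float:
    """Sustained rate (Gbit/s) needed to replicate the daily changes to an archive."""
    bytes_per_day = archive_bytes * daily_change_fraction
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e9


# 1 PB archive, 1 % changing per day -> 10 TB/day.
rate = required_gbit_per_s(1e15, 0.01)

if __name__ == "__main__":
    print(f"Sustained rate needed: {rate:.2f} Gbit/s")
```

The result is a little under 1 Gbit/s sustained around the clock, so a nominal 1 Gbit/s shared link leaves essentially no headroom. That supports the slide's case for guaranteed-bandwidth light paths.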

18 Modelling Post-Processing Infrastructure: UK Long Term Plan? Move BADC to join LHC (successor) Tier1 and European Bioinformatics Institute and to a national big computing facility in Scotland (green power, easy cooling). (Latency will still matter, we can't go offshore. We can improve on lots of things with improved networks, but we can't beat latency. Data still has to be IN the UK.)

19 The Missing Link Importance of metadata: What simulations exist where? Who ran them? Why? Using what model? Configured how? With what output? Cf CMIP5: millions of atomic datasets What quality assessments have been made? Keeping data: evidence (you probably won't be able to rerun five years later!) Without metadata to drive automatic systems we have no show of managing data, and making it available to the research communities. We expect metadata costs (for metadata in a variety of guises) to be very significant!
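The questions on this slide map naturally onto a structured record that automatic systems could query. The sketch below is purely illustrative: the class, its field names and the sample values are hypothetical, and it is not the CMIP5 metadata schema.

```python
from dataclasses import dataclass, field


@dataclass
class SimulationRecord:
    """One atomic dataset, described by the questions the slide asks."""
    dataset_id: str        # what simulation exists...
    location: str          # ...and where
    institute: str         # who ran it
    purpose: str           # why
    model: str             # using what model
    configuration: dict    # configured how
    output_variables: list  # with what output
    quality_checks: list = field(default_factory=list)  # what QC has been done


def find(records, **criteria):
    """Return records whose attributes equal all the given criteria."""
    return [r for r in records
            if all(getattr(r, key) == value for key, value in criteria.items())]


# Hypothetical catalogue entries, for illustration only.
catalogue = [
    SimulationRecord("run-001", "archive-A", "centre-X", "historical",
                     "model-1", {"resolution": "1 degree"}, ["tas", "pr"],
                     ["format check"]),
    SimulationRecord("run-002", "archive-B", "centre-Y", "decadal hindcast",
                     "model-2", {"resolution": "0.5 degree"}, ["thetao"]),
]

if __name__ == "__main__":
    for record in find(catalogue, model="model-2"):
        print(record.dataset_id, record.location, record.quality_checks)
```

With millions of atomic datasets, a lookup like `find` is only possible if every record is filled in consistently, which is where the slide expects the cost to lie.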

20 Our Exascale Future
SI prefix | Name | Power of 10 | Count | Status
k | kilo | 10^3 | thousand | Count on fingers
M | mega | 10^6 | million | Trivial
G | giga | 10^9 | billion | Small
T | tera | 10^12 | trillion | Real
P | peta | 10^15 | quadrillion | Challenging
E | exa | 10^18 | quintillion | Aspirational
Z | zetta | 10^21 | sextillion | Wacko
Y | yotta | 10^24 | septillion | Science fiction
(Stuart Feldman, Google, 2010)
Expect Exabytes of HPC data to be challenging in ten years' time! Too easy to say! (bnl)


More information

HPC Growing Pains. IT Lessons Learned from the Biomedical Data Deluge

HPC Growing Pains. IT Lessons Learned from the Biomedical Data Deluge HPC Growing Pains IT Lessons Learned from the Biomedical Data Deluge John L. Wofford Center for Computational Biology & Bioinformatics Columbia University What is? Internationally recognized biomedical

More information

2014 年 3 月 13 日星期四. From Big Data to Big Value Infrastructure Needs and Huawei Best Practice

2014 年 3 月 13 日星期四. From Big Data to Big Value Infrastructure Needs and Huawei Best Practice 2014 年 3 月 13 日星期四 From Big Data to Big Value Infrastructure Needs and Huawei Best Practice Data-driven insight Making better, more informed decisions, faster Raw Data Capture Store Process Insight 1 Data

More information

N. Marusov, I. Semenov

N. Marusov, I. Semenov GRID TECHNOLOGY FOR CONTROLLED FUSION: CONCEPTION OF THE UNIFIED CYBERSPACE AND ITER DATA MANAGEMENT N. Marusov, I. Semenov Project Center ITER (ITER Russian Domestic Agency N.Marusov@ITERRF.RU) Challenges

More information

Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France

Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France ERF, Big data & Open data Brussels, 7-8 May 2014 EU-T0, Data

More information

A High-Performance Storage and Ultra- High-Speed File Transfer Solution for Collaborative Life Sciences Research

A High-Performance Storage and Ultra- High-Speed File Transfer Solution for Collaborative Life Sciences Research A High-Performance Storage and Ultra- High-Speed File Transfer Solution for Collaborative Life Sciences Research Storage Platforms with Aspera Overview A growing number of organizations with data-intensive

More information

WHITE PAPER QUANTUM S XCELLIS SCALE-OUT NAS. Industry-leading IP Performance for 4K, 8K and Beyond

WHITE PAPER QUANTUM S XCELLIS SCALE-OUT NAS. Industry-leading IP Performance for 4K, 8K and Beyond WHITE PAPER QUANTUM S XCELLIS SCALE-OUT NAS Industry-leading IP Performance for 4K, 8K and Beyond CONTENTS Introduction... 3 Audience... 3 How Traditional Infrastructure is Failing in Creative Workflows...

More information

Programmable Information Highway (with no Traffic Jams)

Programmable Information Highway (with no Traffic Jams) Programmable Information Highway (with no Traffic Jams) Inder Monga Energy Sciences Network Scientific Networking Division Lawrence Berkeley National Lab Exponential Growth ESnet Accepted Traffic: Jan

More information

JASMIN Petascale storage and terabit networking for environmental science

JASMIN Petascale storage and terabit networking for environmental science JASMIN Petascale storage and terabit networking for environmental science Matt Pritchard Centre for Environmental Data Archival RAL Space Jonathan Churchill Scientific Computing Department STFC Rutherford

More information

Experiences of the Development of the Supercomputers

Experiences of the Development of the Supercomputers Experiences of the Development of the Supercomputers - Earth Simulator and K Computer YOKOKAWA, Mitsuo Kobe University/RIKEN AICS Application Oriented Systems Developed in Japan No.1 systems in TOP500

More information

PLAN-E Workshop Switzerland. Welcome! September 8, 2016

PLAN-E Workshop Switzerland. Welcome! September 8, 2016 PLAN-E Workshop Switzerland Welcome! September 8, 2016 The Swiss National Supercomputing Centre Driving innovation in computational research in Switzerland Michele De Lorenzi (CSCS) PLAN-E September 8,

More information

Challenges of Big Data Movement in support of the ESA Copernicus program and global research collaborations

Challenges of Big Data Movement in support of the ESA Copernicus program and global research collaborations APAN Cloud WG Challenges of Big Data Movement in support of the ESA Copernicus program and global research collaborations Lift off NCI and Copernicus The National Computational Infrastructure (NCI) in

More information

Long-term preservation for INSPIRE: a metadata framework and geo-portal implementation

Long-term preservation for INSPIRE: a metadata framework and geo-portal implementation Long-term preservation for INSPIRE: a metadata framework and geo-portal implementation INSPIRE 2010, KRAKOW Dr. Arif Shaon, Dr. Andrew Woolf (e-science, Science and Technology Facilities Council, UK) 3

More information

Bytes and codes. everyday life and Second World War. Mars Lycée du Bois d Amour POITIERS

Bytes and codes. everyday life and Second World War. Mars Lycée du Bois d Amour POITIERS everyday life and Second World War Lycée du Bois d Amour POITIERS Mars 2018 Bytes or bits? Do you know the difference? Bytes or bits? Do you know the difference? Is there a difference? Bytes or bits? bit

More information

Public Resource Distributed Modelling. Dave Stainforth, Oxford University. MISU, Stockholm 8 th March 2006

Public Resource Distributed Modelling. Dave Stainforth, Oxford University. MISU, Stockholm 8 th March 2006 Public Resource Distributed Modelling Dave Stainforth, Oxford University Acknowledgements: Myles Allen, Dave Frame, Carl Christensen, Tolu Aina, Jamie Kettleborough, Mat Collins and many many others. MISU,

More information

CANARIE: Providing Essential Digital Infrastructure for Canada

CANARIE: Providing Essential Digital Infrastructure for Canada CANARIE: Providing Essential Digital Infrastructure for Canada Mark Wolff; CTO April 16, 2014 A Transformation of the Science Paradigm thousands of years ago last few hundred years last few decades today

More information

The LHC Computing Grid

The LHC Computing Grid The LHC Computing Grid Visit of Finnish IT Centre for Science CSC Board Members Finland Tuesday 19 th May 2009 Frédéric Hemmer IT Department Head The LHC and Detectors Outline Computing Challenges Current

More information

To Relay or Not to Relay for Inter-Cloud Transfers? Fan Lai, Mosharaf Chowdhury, Harsha Madhyastha

To Relay or Not to Relay for Inter-Cloud Transfers? Fan Lai, Mosharaf Chowdhury, Harsha Madhyastha To Relay or Not to Relay for Inter-Cloud Transfers? Fan Lai, Mosharaf Chowdhury, Harsha Madhyastha Background Over 40 Data Centers (DCs) on EC2, Azure, Google Cloud A geographically denser set of DCs across

More information

De BiG Grid e-infrastructuur digitaal onderzoek verbonden

De BiG Grid e-infrastructuur digitaal onderzoek verbonden Graphics: Real Time Monitor, Gidon Moont, Imperial College London, see http://gridportal.hep.ph.ic.ac.uk/rtm/ De BiG Grid e-infrastructuur digitaal onderzoek verbonden David Groep, Nikhef KennisKring Amsterdam

More information

Grid Computing at the IIHE

Grid Computing at the IIHE BNC 2016 Grid Computing at the IIHE The Interuniversity Institute for High Energies S. Amary, F. Blekman, A. Boukil, O. Devroede, S. Gérard, A. Ouchene, R. Rougny, S. Rugovac, P. Vanlaer, R. Vandenbroucke

More information

C3S Data Portal: Setting the scene

C3S Data Portal: Setting the scene C3S Data Portal: Setting the scene Baudouin Raoult Baudouin.raoult@ecmwf.int Funded by the European Union Implemented by Evaluation & QC function from European commission e.g.,fp7 Space call Selected set

More information

Cat Herding. Why It s Time for a Millennial Approach to Storage. Cloud Expo East Western Digital Corporation All rights reserved 01/25/2016

Cat Herding. Why It s Time for a Millennial Approach to Storage. Cloud Expo East Western Digital Corporation All rights reserved 01/25/2016 Cat Herding Why It s Time for a Millennial Approach to Storage Cloud Expo East 1 A Time and Place for Everything The PC Movement of the 1980 s put pressure on mainframe storage architects In 1987 the RAID

More information

Management Information Systems OUTLINE OBJECTIVES. Information Systems: Computer Hardware. Dr. Shankar Sundaresan

Management Information Systems OUTLINE OBJECTIVES. Information Systems: Computer Hardware. Dr. Shankar Sundaresan Management Information Systems Information Systems: Computer Hardware Dr. Shankar Sundaresan (Adapted from Introduction to IS, Rainer and Turban) OUTLINE Introduction The Central Processing Unit Computer

More information

EUDAT & SeaDataCloud

EUDAT & SeaDataCloud EUDAT & SeaDataCloud SeaDataCloud Kick-off meeting Damien Lecarpentier CSC-IT Center for Science www.eudat.eu EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-infrastructures.

More information

Lawrence Berkeley National Laboratory Lawrence Berkeley National Laboratory

Lawrence Berkeley National Laboratory Lawrence Berkeley National Laboratory Lawrence Berkeley National Laboratory Lawrence Berkeley National Laboratory Title Scaling the Earth System Grid to 100Gbps Networks Permalink https://escholarship.org/uc/item/80n7w3tw Author Balman, Mehmet

More information