The NCI High Performance Computing (HPC) and High Performance Data (HPD) Platform to Support the Analysis of Petascale Environmental Data Collections

ESSI
Ben Evans (1), Lesley Wyborn (1), Tim Pugh (2), Chris Allen (1), Joseph Antony (1), Kashif Gohar (1), David Porter (1), Jon Smillie (1), Claire Trenham (1), Jingbo Wang (1), Irina Bastrakova (3), Alex Ip (3), Gavin Bell (4)
(1) ANU, (2) Bureau of Meteorology, (3) Geoscience Australia, (4) The 6th Column Project
(Second part of this talk is in next ESSI

High Performance Data (HPD): data that is carefully prepared, standardised and structured so that it can be used in data-intensive science on HPC (Evans, ISESS 2015, Springer).
- HPC is turning compute-bound problems into IO-bound problems.
- HPD is turning IO-bound problems into ontology and semantic problems.
- What are the HPC and HPD drivers?
- How do you build environments on this infrastructure that make it easy for users to do science?

[Figure: Top 500 Super Computer list since ... ; markers: Current NCI, Next NCI]
- Fast and flexible access to structured data is required.
- There needs to be a balance between processing power and the ability to access data (data scaling).
- The focus is on on-demand, direct access to large data sources, enabling high-performance analytics and analysis tools to run directly on that content.

Elephant Flows Place Great Demands on Networks
- Physical pipe that leaks water at a rate of 0.0046% by volume. Result: all but 0.0046% of the water is transferred.
- Network pipe that drops packets at a rate of 0.0046%. Result: 100% of the data is transferred, slowly, at <<5% of optimal speed.
- [Slide annotations: "essentially fixed"; "determined by speed of light"]
- With proper engineering, we can minimize packet loss.
Assumptions: 10 Gbps TCP flow, 80 ms RTT.
See Eli Dart, Lauren Rotman, Brian Tierney, Mary Hester, and Jason Zurawski. The Science DMZ: A Network Design Pattern for Data-Intensive Science. In Proceedings of the IEEE/ACM Annual SuperComputing Conference (SC13), Denver CO, 2013.
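The "<<5% of optimal speed" figure can be sanity-checked with the Mathis et al. bound on steady-state TCP throughput, rate <= C * MSS / (RTT * sqrt(p)). The sketch below is illustrative only: the slide gives the loss rate, link speed and RTT, while the 1460-byte segment size and the constant C = 1 are assumptions that are not in the slide.

```python
import math

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=1.0):
    """Mathis et al. upper bound on steady-state TCP throughput, in bits/s."""
    return c * (mss_bytes * 8) / (rtt_s * math.sqrt(loss_rate))

LINK_BPS = 10e9          # from the slide: 10 Gbps TCP flow
RTT_S = 0.080            # from the slide: 80 ms round-trip time
LOSS = 0.0046 / 100      # from the slide: 0.0046% packet loss
MSS_BYTES = 1460         # assumption: standard Ethernet MSS, not stated in the slide

bound_bps = mathis_throughput_bps(MSS_BYTES, RTT_S, LOSS)
fraction_of_link = bound_bps / LINK_BPS

print(f"Throughput bound: {bound_bps / 1e6:.1f} Mbit/s")
print(f"Fraction of link: {fraction_of_link:.2%}")
```

Under these assumptions the bound comes out at a few tens of Mbit/s, well under 1% of the 10 Gbps link. Larger segments (jumbo frames) or a lower loss rate raise the bound. The RTT over a long path cannot be reduced much.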

Computational and Cloud Platforms

Raijin:
- 57,472 cores (Intel Xeon Sandy Bridge technology, 2.6 GHz) in 3592 compute nodes
- 160 TBytes (approx.) of main memory
- Infiniband FDR interconnect
- 7 PBytes (approx.) of usable fast filesystem (for short-term scratch space)
- 1.5 MW power; 100 tonnes of water in cooling

Partner Cloud:
- Same generation of technology as Raijin (Intel Xeon Sandy Bridge technology, 2.6 GHz) but only 1500 cores
- Infiniband FDR interconnect
- Collaborative platform for services and [...]
- The platform for hosting non-batch services

NCI Nectar Cloud:
- Same generation as partner cloud
- Non-managed environment
- Weak integration
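A few per-node figures follow directly from the Raijin numbers above. The sketch below is back-of-envelope arithmetic only: the memory-per-node value is an average (the slide does not say how memory is distributed across nodes), and the peak figure assumes 8 double-precision floating-point operations per core per cycle, a commonly quoted value for Sandy Bridge with AVX that is not stated in the slide.

```python
CORES = 57_472            # from the slide
NODES = 3_592             # from the slide
CLOCK_HZ = 2.6e9          # from the slide: 2.6 GHz
MEMORY_BYTES = 160e12     # from the slide: approx. 160 TBytes
FLOPS_PER_CYCLE = 8       # assumption: DP FLOPs per core per cycle (Sandy Bridge, AVX)

cores_per_node = CORES // NODES
avg_mem_per_node_gb = MEMORY_BYTES / NODES / 1e9
avg_mem_per_core_gb = MEMORY_BYTES / CORES / 1e9
peak_pflops = CORES * CLOCK_HZ * FLOPS_PER_CYCLE / 1e15

print(f"Cores per node:          {cores_per_node}")
print(f"Average memory per node: {avg_mem_per_node_gb:.1f} GB")
print(f"Average memory per core: {avg_mem_per_core_gb:.2f} GB")
print(f"Theoretical peak:        {peak_pflops:.2f} PFLOPS (under the stated assumption)")
```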

NCI Cloud
[Architecture diagram. Recoverable labels: Lustre; NFS; FDR IB; SSD; per-tenant public IP assignments (CIDR boundaries typically /29); OpenStack private IP (flat network*), quota managed.]
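For readers unfamiliar with CIDR notation, a /29 boundary means each tenant's public block spans 8 addresses. A minimal sketch using Python's standard `ipaddress` module is shown below. The prefix 203.0.113.0/29 is a documentation-only example range, not an NCI address, and how many of the 8 addresses a tenant can actually assign depends on the deployment, which the slide does not describe.

```python
import ipaddress

# Hypothetical tenant block; 203.0.113.0/24 is reserved for documentation examples.
block = ipaddress.ip_network("203.0.113.0/29")

total_addresses = block.num_addresses   # every address inside the /29
host_addresses = list(block.hosts())    # excludes the network and broadcast addresses

print(f"Block:           {block}")
print(f"Netmask:         {block.netmask}")
print(f"Total addresses: {total_addresses}")
print(f"Host addresses:  {host_addresses[0]} to {host_addresses[-1]}")
```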

NCI's integrated high-performance environment
[Architecture diagram. Recoverable labels:]
- Internet; link to second data centre; NCI data movers; 10 GigE
- /g/data 56Gb FDR IB fabric; Raijin 56Gb FDR IB fabric
- Cloud; Raijin login + data movers; Raijin HPC compute
- Massdata (tape): cache 1.0 PB, tape 20 PB
- Persistent global parallel filesystem: /g/data1 7.4 PB; /g/data ... PB; /g/data3 9 PB
- Raijin high-speed filesystem: /short 7.6 PB; /home, /system, /images, /apps

10+ PB of Data for Interdisciplinary Science
- CMIP5: 3 PB
- Earth Observ.: 2 PB
- Marine Videos: 10 TB
- Astronomy (Optical): 200 TB
- Atmosphere: 2.4 PB
- Water Ocean: 1.5 PB
- Weather: 340 TB
- Bathy, DEM: 100 TB
- Geophysics: 300 TB
Figure legend: BOM, GA, CSIRO, ANU, Other National, International

National Environment Research Data Collections (NERDC)
1. Climate/ESS Model Assets and Data Products
2. Earth and Marine Observations and Data Products
3. Geoscience Collections
4. Terrestrial Ecosystems Collections
5. Water Management and Hydrology Collections

Data collections                                     Approx. capacity
CMIP5, CORDEX                                        ~3 PBytes
ACCESS products                                      2.4 PBytes
LANDSAT, MODIS, VIIRS, AVHRR, INSAR, MERIS           1.5 PBytes
Digital Elevation, Bathymetry, Onshore Geophysics    700 TBytes
Seasonal Climate                                     700 TBytes
Bureau of Meteorology Observations                   350 TBytes
Bureau of Meteorology Ocean-Marine                   350 TBytes
Terrestrial Ecosystem                                290 TBytes
Reanalysis products                                  100 TBytes

Internationally sourced
- Satellite data (USGS, NASA, JAXA, ESA, ...)
- Reanalysis (ECMWF, NCEP, NCAR, ...)
- Climate data (CMIP5, AMIP, GeoMIP, CORDEX, ...)
- Ocean modelling (Earth Simulator, NOAA, GFDL, ...)
These will only increase as we depend on more data, and some will be replicated. How can we better keep this in sync, versioned, and back-referenced for the supplier?
Organise long-tail data that calibrates and integrates with the big data. How should we manage this data, keep it versioned, and easily attribute the supplier (researcher? collaboration? university? agency?)

Some Data Challenges

Data formats
- Standardize data formats: time to convert legacy and proprietary ones
- Appropriately normalise the data: data models and conventions
- Adopt HPC-enabled libraries that abstract storage
- Expose all attributes for search: not just collection-level search, not just datasets, all data attributes

What are the handles we need to access the data?
- Provide more programmatic interfaces and link up data and compute resources
- More server-side processing
- Add the semantic meaning to the data
- Create useful datasets (in the programming context) from data collections
- Is it scientifically appropriate for a data service to aggregate/interpolate?

What unique/persistent identifiers do we need?
- DOI is only part of the story
- Versioning is important
- Born-linked data and maintaining graph infrastructure

Regularising High Performance Data using HDF5
[Layered diagram, top to bottom:]
- Compilers & Tools: Fortran, C, C++, Python, R, MatLab, IDL; Ferret, CDO, NCL, NCO, GDL, GDAL, GrADS, GRASS, QGIS; Globe Caritas; Open Nav Surface
- Metadata Layer: netCDF-CF; HDF-EOS5; ISO 19115, RIF-CS, DCAT etc.
- Library Layer 1: NetCDF-4 Library; libgdal; [FITS]; Airborne Geophysics Line data; [SEG-Y]; BAG
- Library Layer 2: HDF5 MPI-enabled; HDF5 Serial
- Storage: Lustre; Other Storage (options)

Regularising High Performance Data using HDF5 including Data Services
[Layered diagram, top to bottom:]
- Compilers & Tools: Fortran, C, C++, Python, R, MatLab, IDL; Ferret, CDO, NCL, NCO, GDL, GDAL, GrADS, GRASS, QGIS; Globe Caritas; Open Nav Surface
- Services (expose data model + semantics): OPeNDAP; OGC WMS; OGC WCS; OGC WFS; OGC WPS; OGC SOS; fast whole-of-library catalogue
- Metadata Layer: netCDF-CF; HDF-EOS5; ISO 19115, RIF-CS, DCAT etc.
- Library Layer 1: NetCDF-4 Library; libgdal; [FITS]; Airborne Geophysics Line data; [SEG-Y]; BAG
- Library Layer 2: HDF5 MPI-enabled; HDF5 Serial
- Storage: Lustre; Other Storage (options)
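As an illustration of what the services layer offers a client, the sketch below builds an OGC WMS 1.3.0 GetMap request URL with the Python standard library. The request parameters are the standard WMS ones. The endpoint, layer name and bounding box are hypothetical placeholders and do not refer to an actual NCI service.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def wms_getmap_url(endpoint, layer, bbox, width, height,
                   crs="CRS:84", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL; bbox is (min_x, min_y, max_x, max_y) in `crs`."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",                       # empty string requests the default style
        "CRS": crs,                         # CRS:84 uses longitude, latitude axis order
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{endpoint}?{urlencode(params)}"

# Hypothetical endpoint and layer name, for illustration only.
url = wms_getmap_url(
    endpoint="https://example.org/wms",
    layer="example_layer",
    bbox=(112, -44, 154, -10),
    width=840,
    height=680,
)
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(url)
```

The same dataset exposed through OPeNDAP or WCS would return data values rather than a rendered image. WMS is used here only because its request is the simplest to show.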

Finding data and services
[Diagram. Recoverable labels: GeoNetwork catalogue; Supercomputer access; Virtual lab; DAP, OGC, Services; Lucene database; /g/data1; /g/data2; Trialing Elastic Search.]

Prototype to Production - anti-"Mine craft"
Virtual Labs: separating researchers from software builders.
Cloud is an enabler, but:
- don't make researchers become full system admins
- save developers from being operational
Project lifecycle and preparing for success
[Chart. Recoverable labels: Perspiration; Productivity; Proj1:Start; Proj1:End; Proj2-4:Start; Proj2-4:End.]

Prototype to Production - anti-"Mine craft"
[Chart. Recoverable labels: VL Managers; Developers; Developer, VL Mgr., Developer?; Headspace hours; Development Phase in a project; Poorly executed, Reasonably executed, Well executed.]

Prototype to Production - anti-"Mine craft"
[Chart. Recoverable labels: VL Managers; Developers; Developer, VL Mgr, Developer; Changed Scope adopted broadly; Headspace hours; Development Phase in a project; Poorly executed, Reasonably executed, Well executed.]

Virtual Laboratory driven software patterns
[Diagram. Recoverable labels:]
- Layers: Basic OS functions; Common Modules; Bespoke Services; Special config choices
- Stacks: Super Software Stack; NCI Stack 1; NCI Env Stack; Analytics Stack; Vis Stack; Workflow X; Gridftp; P2P
- Variations: 2xStack1; Modify Stack1; Modify Stack 2
Take stacks from upstream and use them as bundles.
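One way to picture "take stacks from upstream and use as bundles" is as an ordered merge, where each layer may add to or override what the layers beneath it provide. The sketch below is a toy model of that idea only. The slides do not describe NCI's actual bundle format or tooling, and every stack name, key and value here is hypothetical.

```python
def compose_stack(*layers):
    """Merge layers in order; later layers override earlier ones. Inputs are not modified."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged

# Hypothetical layers, ordered from most general to most specific.
base_os = {"os": "linux-base", "modules": "common"}
nci_env_stack = {"modules": "nci-env", "scheduler-client": "1.0"}
analytics_stack = {"python": "3", "netcdf": "4"}
vl_overrides = {"python": "3-custom"}   # a virtual lab's "special config choices"

virtual_lab = compose_stack(base_os, nci_env_stack, analytics_stack, vl_overrides)

for name, value in sorted(virtual_lab.items()):
    print(f"{name}: {value}")
```

Keeping the upstream layers untouched and expressing local changes as a separate top layer is what makes a "modify Stack 1" variant cheap to maintain when the upstream stack is updated.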

Transition from developer, to prototype, to DevOps

Step 1: Development
- Get a template for development
- Identify what is special; separate out what is common
- Reuse other software stacks where possible

Step 2: Prototype
- Deploy in an isolated tenant of a cloud
- Determine dependencies
- Write test cases to demonstrate correct functioning

Step 3: Sustainability
- Pull the repo into the operational tenant
- Prepare the bundle for integration with the rest of the framework
- Hand back the cleaned bundle
- Establish the DevOps process

DevOps approach to building and operating environments
Virtual Laboratory Operational Bundle:
- Git controlled
- pull model
- continuous integration testing
[Diagram labels: NCI Core Bundles; Community1 repo; Community2 repo.]

Advantages
- Separates roles and responsibilities, from gatekeeper to DevOps management: specialist on package; VL managers; system admin
- Architecture to Platform: flexible with technology change; makes handover/maintenance easier
- Both Test/Dev/Ops and patches/rollback become BAU
- Sharable bundles
- Can tag releases of software stacks
- Precondition for trusted software stacks
- Provenance: scientific / government policy scrutiny

A snapshot of layered bundles to build complex VLs

Easy analysis environments
- Increasing use of IPython Notebooks
- VDI: easy in-situ environment using virtual analysis desktops

VDI (cont.)

NCI Petascale Data-Intensive Science Platform
- Data Services: THREDDS
- Server-side analysis and visualization
- VDI: cloud-scale user desktops on data
- 10 PB+ research data
- Web-time analytics software

Summary: Progress toward Major Milestones

Interdisciplinary Science
To publish, catalogue and access data and software for enhancing interdisciplinary, big data-intensive (HPD) science, with interoperable data services and protocols.

Integrity of Science
Managed services to capture a workflow's process as a comparable, traceable output. Ease of access to data and software for enhanced workflow development and repeatable science, which can be conducted with less effort or an acceleration of outputs.

Integrity of Data
Data repository services to ensure data integrity, provenance records, universal identifiers, and repeatable data discovery and access from workflows or interactive users.


More information

Sun Lustre Storage System Simplifying and Accelerating Lustre Deployments

Sun Lustre Storage System Simplifying and Accelerating Lustre Deployments Sun Lustre Storage System Simplifying and Accelerating Lustre Deployments Torben Kling-Petersen, PhD Presenter s Name Principle Field Title andengineer Division HPC &Cloud LoB SunComputing Microsystems

More information

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Unleash Your Data Center s Hidden Power September 16, 2014 Molly Rector CMO, EVP Product Management & WW Marketing

More information



