SURFSARA, TECHNOLOGICAL EXPERTISE & SURF E-INFRASTRUCTURE Axel Berg Information event etec-big call NLeSC-SURFsara Amsterdam, April 9, 2019


1 SURFSARA, TECHNOLOGICAL EXPERTISE & SURF E-INFRASTRUCTURE Axel Berg Information event etec-big call NLeSC-SURFsara Amsterdam, April 9, 2019

2 We are SURF

3 Fields of work
- Education: flexible education; diverse learning resources; using study data
- Research: unlimited access; world-class facilities; stimulating Open Science
- Cooperative facilities: on campus; security in the digital world; user-centred

4 Worldwide LHC Computing Grid. National supercomputer service Cartesius.

5 It's far more than just BIG systems
As scientific problems become more complex:
- collaborations will grow
- data sizes will grow
- e-infrastructure demands will grow
- integration of compute and data is usually required
For future e-infrastructure development this means advances and collaborations in the complete e-infrastructure ecosystem, through co-creation: community services + software (system, middleware, libraries, applications) + algorithms + programming models + workflows + hardware (compute, data, network) + datacenter + operations + support + data management + training + education + integration + federation + you name it.
Lawrence Livermore example

6 Example: SETI, the Search for Extraterrestrial Intelligence (photo: SETI Institute)
Exploratory science that seeks evidence of life in the universe by looking for some signature of its technology.
- Access to national research ICT facilities
- Data processing of LOFAR telescope data on the Grid
- Applying complex queries on a Hadoop cluster using Spark and Scala
- In-depth scientific and technological expertise to set up data pipelines

7 Convergence of machine learning and large-scale numerical simulations
Could machine learning approaches enhance traditional HPC workloads, like numerical simulations?
- Predict simulation outcomes instead of calculating a full simulation using numerical analysis
- Use data from simulation to train a deep neural network → use for inference analysis to simulate the system being studied
- Simulate a coarse grid and use it as input to make predictions at a refined grid
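The idea of predicting a simulation outcome from earlier simulation runs can be illustrated with a deliberately small sketch. Nothing in it comes from the slides: the toy cooling simulation, the function names and the parameter values are assumptions, and a least-squares line stands in for the deep neural network so that the example stays self-contained.

```python
import math

def simulate_cooling(k, t_end=10.0, dt=0.001, t0=90.0, t_env=20.0):
    """'Expensive' reference run: explicit Euler steps for dT/dt = -k (T - T_env)."""
    temp = t0
    for _ in range(int(round(t_end / dt))):
        temp += -k * (temp - t_env) * dt
    return temp

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for training a network)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def train_surrogate(ks, t_env=20.0):
    """Simulate a few parameter values, then fit log(T_end - T_env) against k."""
    targets = [math.log(simulate_cooling(k, t_env=t_env) - t_env) for k in ks]
    a, b = fit_line(ks, targets)
    return lambda k: t_env + math.exp(a + b * k)

surrogate = train_surrogate([0.05, 0.10, 0.15, 0.20, 0.25, 0.30])
k_new = 0.18  # hypothetical parameter value that is not in the training set
print(f"simulated: {simulate_cooling(k_new):.3f}")
print(f"surrogate: {surrogate(k_new):.3f}")
```

Once trained, the surrogate answers in a single function evaluation, whereas the reference run needs 10,000 time steps. That trade-off is what the slide refers to, at a much smaller scale.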

8 Machine learning enhanced HPC applications
- Climate modeling: Dr. Chiel van Heerwaarden, Wageningen University: Machine-learned turbulence in next-generation weather models
- High-energy physics: Dr. Sacha Caron, Radboud University: Generating physics without an event generator
- Life sciences: Prof. Alexandre M.J.J. Bonvin, Utrecht University: Distinguishing biological interfaces from crystal artifacts in biomolecular complexes using deep learning
- Astrophysics: Prof. Simon Portegies Zwart, Leiden University: Machine learning for accelerating planetary dynamics in stellar clusters
Article "Transforming HPC research with AI approaches" on The Next Platform: -hpc-research-with-ai-approaches/

9 Collaboration SURF - research community in service development
Diagram labels: research domain expertise; data services; software services; national ICT infrastructure services; co-creation; ICT infrastructure expertise

10 Example collaborations

11 Essential component of e-infrastructure: Expert Support
- Matching e-infrastructure (development) to the scientific problem → effective use of the e-infrastructure
- Co-creation: discipline expertise and e-infrastructure development and implementation expertise
- Scalability of e-infrastructure and applications
- Close collaboration with scientists, NLeSC and SURF
- Performance optimisation and parallelisation; algorithm improvements; I/O optimisation; accelerator optimisation; treatment of large datasets; distributed computing; scalable machine learning; data analytics; streaming data pipelines; research data management; visualisation (remote, in-situ); etc.

12 Five areas of support
- Connecting research challenges and ICT solutions: Which ICT service is most suited for my research challenge?
- National and international access mode routing: How do I gain access to the national and international research facilities, and which access mode is most suited (institutional access, excellence-driven, wide access, international)?
- Optimising ICT services: SaaS > How to optimise software? PaaS > How can I make use of discipline-specific research environment platforms? IaaS > How to optimise the ICT infrastructure with parallel processing and job batching?
- Analysing & visualising: How to gain insight and meaning from research data? How to get results from complex and unstructured data sets? How do I visualise my research data?
- Managing research data: How to handle data before, during and after my research? How to collect, collaborate, manage, transfer, store, find and reuse data?

13 The national digital infrastructure
- Access to a wide range of reliable services to process, analyse, store and share data
- Available to all researchers, independent of the scientific domain
- Superior digital infrastructure is crucial to the Dutch knowledge economy

14 Digital infrastructure for research in the Netherlands
Diagram labels: large-scale scientific infrastructure; report; access; compute; storage; network; visualization; authentication & authorization; research ICT support; NL e-infra; research institutes; store

15 What SURF does for research
Research data lifecycle: (re)search, access, compute, analyse, store, report
- High-end compute services: supercomputing, cluster computing, visualisation
- Data processing: scalable data analytics, Grid services, HPC cloud
- Data services: online data services, data preservation services, data management services
- Network & collaboration services: fast networks for data transfers and compute scale-out, federated access & identity management solutions
- Optimise, translate, integrate

16 Computing services
- Dutch supercomputer Cartesius: large-scale parallel applications
- Lisa compute cluster: processing power with user friendliness
- Visualisation: powerful remote visualization
- Grid: high-end distributed data processing
- HPC Cloud: high-performance cloud for research
Available compute power: >2 PFlops on > CPU cores, >450 GPUs

17 Scalable Data Analytics
- Exploration / mining of Big Data
- Streaming data (real-time analysis): 2 to 6 million tweets/day, Prof. Antal van den Bosch, Radboud University, Meertens Institute
- Spark framework, flexible infrastructure built on Kubernetes, training with Jupyter notebooks

18 SURF Data services
Service | Features | Intended for | Long/short term
SURF Research Drive | One view on data | Collaborative working and sharing of research data | Short/long term
SURFdrive | Dropbox-like | Online collaboration | Short term
SURFfilesender | WeTransfer-like, with encryption | Online collaboration | Short term
HPC central data archive | Tiered, dual copies, supports NFS/ssh/scp/gridftp | For use with super compute facilities | Long term preservation
PID service | Identifier service for datasets | Publishing data | Long term
SURF Object Store | S3/REST API | Online storage for large data volumes | Short term
SURF Data Management platform | Metadata handling, policy-based data orchestration | | Long term
SURF object storage

19 The collaborative environment that supports open science
Diagram labels: analyse; describe; manage; HPC; process; national supercomputer; HPC Cloud services; publish; collect; access; find; PID service; object store; data archive; persistent identifiers; online S3/REST API; offline; sharekit; data preservation; future concept

20 Identity Management Services
Identity Providers, Service Providers
SURFconext
- access to cloud services
- federative login (institutional account, log in once, receive access everywhere)
Science Collaboration Zone (pilot)
- access to (research) specific services
- federative login AND guests, teams, international

21 ICT, expertise & collaboration
Columns: ICT services; Knowledge & expertise; Collaboration
- Large-scale computations, simulations & modelling on Cartesius supercomputer, Lisa cluster, HPC Cloud*
- Accessing the Grid for large-scale data processing
- PaaS environment tools, compilers, libraries
- Performance optimization on computing facilities
- Running your software in parallel for faster processing
- Machine learning, big data analysis, visualization
- Training
- Consolidation of research communities in the Netherlands
- Multidisciplinary approach
- Collaboration between local and national research infrastructures
- Network connectivity
- Lightpath: very quickly send and receive data; direct connection shielded from the Internet; extra secure
- Security, trust & identity
- Integrating your virtual infrastructure into your work processes
- Science Collaboration Zone for authorization & authentication
- International collaboration
- Strong international orientation: high level of participation in European research programs
- Accessing the Grid for data processing worldwide
- Data storage and sharing services, long-term archiving
- FAIR data management, data stewardship
- Methods for approaching your data
- The exact design of your data storage system
- How to organise your data infrastructure
- Meeting the FAIR data principles
- Public-private cooperation
- Datahubs, strategic partnerships with private sector
* Regular Compute Call via NWO (duration 2 yrs, start within 2 months, combination of SURF compute services allowed): Cartesius supercomputer >500,000 core hours, Lisa compute cluster >100,000 core hours, HPC Cloud >50,000 core hours, Grid >500,000 core hours (max data 200 TB disk / 300 TB tape)
* Compute access via SURF: Cartesius supercomputer <500,000 core hours, Lisa cluster <100,000, Grid data processing <500,000, HPC Cloud <50,000. Max 1 application per research project per year.
More info about services: SURF.nl/research or via support4research@surf.nl
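The two access routes in the footnotes amount to a per-service threshold on requested core hours. A minimal sketch of that rule follows. The function name and the handling of a request exactly at the threshold are assumptions, since the slide only states ">" and "<".

```python
# Thresholds in core hours, copied from the footnotes on this slide.
THRESHOLDS = {
    "Cartesius": 500_000,
    "Lisa": 100_000,
    "HPC Cloud": 50_000,
    "Grid": 500_000,
}

def access_route(service, core_hours):
    """Return the application route that matches a request size.

    Assumption: a request exactly at the threshold is routed to NWO;
    the slide does not specify the boundary case.
    """
    if service not in THRESHOLDS:
        raise ValueError(f"unknown service: {service}")
    if core_hours >= THRESHOLDS[service]:
        return "Regular Compute Call via NWO"
    return "Compute access via SURF"

print(access_route("Lisa", 20_000))
print(access_route("Cartesius", 2_000_000))
```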

22 THANK YOU! SURF Open Innovation Lab. Driving innovation together

23 Twi-NL
- Project of Meertens Institute (previously NLeSC and Radboud University)
- Collection of Dutch tweets for scientific research
- About 40% of all Dutch tweets
- Used by CBS and others

24 The Green Village
- Innovation lab at Delft
- Multiple projects, lots of sensor data
- Many projects share (streaming) data
- SURFsara offers a platform based on Apache Kafka

25 Project MinE: ALS disease
- DNA samples
- 2 million gigabytes of data
- 8 million compute hours
- Scalability of application
- Data analysis expertise

26 Project MinE is a large-scale research initiative devoted to discovering the genetic cause of ALS
- Data from 20 participating countries; data upload to SURFsara; data analysis at SURFsara
- Plan to map the full DNA profiles of at least 15,000 people with ALS and compare these to DNA profiles of 7,500 control subjects
- 46% of DNA profiles collected so far
- Data is uploaded to SURFsara and stored on tape (dCache)
- Data is uploaded through a third party (e.g., Illumina) or directly by the participating country/contributing institute
- More than 1 PB of data already uploaded
- Data analysis is run on the Grid (Gina and Nikhef)
- Users from various countries can run the analysis themselves
- More than 4 million core hours of compute already performed

27 Our support model for the project
A dedicated Project MinE user interface is made available with:
* All software tools to submit jobs to the Grid (e.g., Grid middleware)
* Grid storage clients (incl. setting up of a Globus Connect Personal endpoint)
* Read-only NFS mount of dCache to access data
On the dCache storage system, data directories have ACLs so that:
* Only specified users can access the data in each country's directory
* Users can have their own private directories
For analysis on the Grid:
* RCauth proxy authentication for all users
* Access to long queue (96 hours walltime)
* Dedicated shared softdrive space for software installation
* Individual space on softdrive for their own pipelines
* Access to pilot job frameworks (PiCaS)
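The pilot job pattern mentioned in the last item can be sketched in a few lines: each pilot is a generic job that, once running, keeps pulling work items ("tokens") from a shared pool until none are left. The sketch below does not use the PiCaS API. An in-process queue stands in for the central token pool, threads stand in for Grid jobs, and all names are hypothetical.

```python
import queue
import threading

def pilot(name, tokens, results, process):
    """One pilot job: fetch and process tokens until the pool is empty."""
    while True:
        try:
            token = tokens.get_nowait()
        except queue.Empty:
            return
        results.append((name, token, process(token)))

def run_pilots(work_items, process, n_pilots=3):
    """Fill the token pool, start the pilots and wait for them to finish."""
    tokens = queue.Queue()
    for item in work_items:
        tokens.put(item)
    results = []  # list.append is thread-safe in CPython
    pilots = [
        threading.Thread(target=pilot, args=(f"pilot-{i}", tokens, results, process))
        for i in range(n_pilots)
    ]
    for p in pilots:
        p.start()
    for p in pilots:
        p.join()
    return results

# Hypothetical work items; str.upper stands in for the real analysis step.
done = run_pilots(["sample-01", "sample-02", "sample-03", "sample-04"], process=str.upper)
print(sorted(token for _, token, _ in done))
```

Because the pilots pull work instead of having it assigned up front, a slow or failed pilot does not leave a fixed share of the work items unprocessed.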

28 Collaborations under IPCC: CERN openlab; Dell EMC; GENCI/CINES/INRIA; Radboud University Medical Center; Max Planck Institute; Netherlands Cancer Institute (NKI)

30 Dell-SURFsara-Intel CheXNet
- CheXNet (from Stanford) is a model for identifying thoracic pathologies from the NIH ChestXray14 dataset
- DenseNet121 topology, pretrained on ImageNet
- Dataset contains 112K images
- Multicategory / multilabel
- Unbalanced

31 ResNet-50 and beyond
- ResNet-50 provides great scalability and improved results
- ResNet-59, with 896x896 input, improves accuracy further
- It takes 50 minutes to train ResNet-59 on ChestXRay-14 using 240 nodes for higher accuracy
- Challenge encountered: large memory usage for large-input models

32 Exploring scalable, accurate AI radiology models
High computational and memory requirements!
Chart: mean AUROC for DenseNet-121 (224x224), ResNet-50 (224x224), ResNet-59 (896x896), AmoebaNet-D architecture: AmoebaNet (4,256) (299x299) and AmoebaNet (2,512) (480x480)
Model | ResNet50 | ResNet59 | AmoebaNet (4,256) | AmoebaNet (2,512)
No. parameters | 22M | 90M | 80M | 168M
Input size | 224x224 | 896x896 | 299x299 | 480x480
Training throughput [img/s/node] | values lost in extraction
Memory consumption / batch of 64 images [GB] | values lost in extraction
Approximate training time on 256 nodes [minutes] | values lost in extraction
Top-1 accuracy on ImageNet-1K | (lost) | 78.1% | 79.9% | 80.9%

34 World flora classification
Encountered difficulties
- Protobuf library is limited to 2GB files:
  - Impossible to serialize ResNet-50 model with around 300K classes
  - Fully connected layer gets very large (~2GB)
- Communication bottlenecks at large node counts
- Dataset is very large (1.5TB, 11.5M images) and poses problems to Lustre filesystems
Used CEA Irene cluster (1600 Skylake 2S nodes)
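The size of the fully connected layer can be checked with a short calculation. The sketch assumes 32-bit floats, the 2048-feature input of the standard ResNet-50 classifier, the 294K class count shown on the next slide, and a Protobuf limit of 2^31 - 1 bytes. The helper name is hypothetical.

```python
# Assumed Protobuf message size limit (2 GiB, signed 32-bit length).
PROTOBUF_LIMIT_BYTES = 2**31 - 1

def fc_layer_bytes(n_features, n_classes, bytes_per_param=4):
    """Size of a dense layer: weight matrix plus one bias per class."""
    n_params = n_features * n_classes + n_classes
    return n_params * bytes_per_param

size = fc_layer_bytes(n_features=2048, n_classes=294_000)
print(f"{size / 1e9:.2f} GB, exceeds limit: {size > PROTOBUF_LIMIT_BYTES}")
# prints "2.41 GB, exceeds limit: True"
```

Under these assumptions the classifier layer alone is larger than the serialization limit, before the convolutional part of the network is counted.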

35 294K classes: solutions
Protobuf 2GB files: dimensionality reduction trick
- ResNet-50 network size: 2.3GB
- ResNet-50 network size: 1.8GB
Improving Lustre performance
Configuration | 2 nodes | 32 nodes | Scaling efficiency
No striping | 47.5 img/s (23.7 img/s/node) | 303 img/s (9.5 img/s/node) | 40.1%
Lustre striping (SC: 64, SS: 32M) | 47.5 img/s (23.7 img/s/node) | 688 img/s (21.5 img/s/node) | 90.7%
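The scaling efficiency figures follow from the per-node throughput values on this slide: per-node throughput on 32 nodes divided by per-node throughput on 2 nodes. The function name below is an assumption, and the inputs are the rounded per-node values as printed.

```python
def scaling_efficiency(per_node_base, per_node_scaled):
    """Per-node throughput at scale relative to the baseline per-node throughput."""
    return per_node_scaled / per_node_base

no_striping = scaling_efficiency(23.7, 9.5)
with_striping = scaling_efficiency(23.7, 21.5)
print(f"no striping:     {no_striping:.1%}")    # prints 40.1%
print(f"Lustre striping: {with_striping:.1%}")  # prints 90.7%
```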

36 Results (Skylake)
Table columns: Top-1 accuracy [%]; Top-5 accuracy [%]; training time [minutes/epoch] (values lost in extraction)
Table rows: Irene, 512 Skylake nodes, ResNet-50; Irene, 1320 Skylake nodes, ResNet-50; Irene, 1024 Skylake nodes, ResNet-50, Collapsed Ensemble
Huge improvements in model accuracy!
Chart: scalability on Irene (images/second, compared with ideal)

SURFsara Data Services

SURFsara Data Services SURFsara Data Services SUPPORTING DATA-INTENSIVE SCIENCES Axel Berg Dutch Scientific Challenges From High Energy Physics to atomic and molecular physics (DNA); Life sciences (cell biology); Human interaction

More information

e-research Infrastructures for e-science Axel Berg SARA national HPC & e-science support center RAMIRI, June 15, 2011

e-research Infrastructures for e-science Axel Berg SARA national HPC & e-science support center RAMIRI, June 15, 2011 e-research Infrastructures for e-science Axel Berg SARA national HPC & e-science support center RAMIRI, June 15, 2011 Science Park Amsterdam a world of science in a city of inspiration > Faculty of Science

More information

The Cambridge Bio-Medical-Cloud An OpenStack platform for medical analytics and biomedical research

The Cambridge Bio-Medical-Cloud An OpenStack platform for medical analytics and biomedical research The Cambridge Bio-Medical-Cloud An OpenStack platform for medical analytics and biomedical research Dr Paul Calleja Director of Research Computing University of Cambridge Global leader in science & technology

More information

National R&E Networks: Engines for innovation in research

National R&E Networks: Engines for innovation in research National R&E Networks: Engines for innovation in research Erik-Jan Bos EGI Technical Forum 2010 Amsterdam, The Netherlands September 15, 2010 Erik-Jan Bos - Chief Technology Officer at Dutch NREN SURFnet

More information

Connecting the e-infrastructure chain

Connecting the e-infrastructure chain Connecting the e-infrastructure chain Internet2 Spring Meeting, Arlington, April 23 rd, 2012 Peter Hinrich & Migiel de Vos Topics - About SURFnet - Motivation: Big data & collaboration - Collaboration

More information

Storage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan

Storage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan Storage Virtualization Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan Storage Virtualization In computer science, storage virtualization uses virtualization to enable better functionality

More information

A national approach for storage scale-out scenarios based on irods

A national approach for storage scale-out scenarios based on irods A national approach for storage scale-out scenarios based on irods Christine Staiger Ton Smeele SURFsara Utrecht University Science Park 140, ITS/RDM Amsterdam, The Heidelberglaan 8, Netherlands Utrecht,

More information

The Materials Data Facility

The Materials Data Facility The Materials Data Facility Ben Blaiszik (blaiszik@uchicago.edu), Kyle Chard (chard@uchicago.edu) Ian Foster (foster@uchicago.edu) materialsdatafacility.org What is MDF? We aim to make it simple for materials

More information

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Unleash Your Data Center s Hidden Power September 16, 2014 Molly Rector CMO, EVP Product Management & WW Marketing

More information

Introduction to Grid Computing

Introduction to Grid Computing Milestone 2 Include the names of the papers You only have a page be selective about what you include Be specific; summarize the authors contributions, not just what the paper is about. You might be able

More information

Data Movement & Tiering with DMF 7

Data Movement & Tiering with DMF 7 Data Movement & Tiering with DMF 7 Kirill Malkin Director of Engineering April 2019 Why Move or Tier Data? We wish we could keep everything in DRAM, but It s volatile It s expensive Data in Memory 2 Why

More information

Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21)

Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21) Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21) NSF-wide Cyberinfrastructure Vision People, Sustainability, Innovation, Integration Alan Blatecky Director OCI 1 1 Framing the

More information

CSD3 The Cambridge Service for Data Driven Discovery. A New National HPC Service for Data Intensive science

CSD3 The Cambridge Service for Data Driven Discovery. A New National HPC Service for Data Intensive science CSD3 The Cambridge Service for Data Driven Discovery A New National HPC Service for Data Intensive science Dr Paul Calleja Director of Research Computing University of Cambridge Problem statement Today

More information

Coupled Computing and Data Analytics to support Science EGI Viewpoint Yannick Legré, EGI.eu Director

Coupled Computing and Data Analytics to support Science EGI Viewpoint Yannick Legré, EGI.eu Director Coupled Computing and Data Analytics to support Science EGI Viewpoint Yannick Legré, EGI.eu Director yannick.legre@egi.eu Credit slides: T. Ferrari www.egi.eu This work by EGI.eu is licensed under a Creative

More information

towards a federated infrastructure enabling integrated life science research

towards a federated infrastructure enabling integrated life science research towards a federated infrastructure enabling integrated life science research ELIXIR Innovation & SME forum March 18 2015 Ruben Kok & Jaap Heringa www.dtls.nl ZOOMING IN AND OUT OF LIFE LIFE @ ALL LEVELS

More information

2013 AWS Worldwide Public Sector Summit Washington, D.C.

2013 AWS Worldwide Public Sector Summit Washington, D.C. 2013 AWS Worldwide Public Sector Summit Washington, D.C. EMR for Fun and for Profit Ben Butler Sr. Manager, Big Data butlerb@amazon.com @bensbutler Overview 1. What is big data? 2. What is AWS Elastic

More information

How Five International Networks are Enabling International Data-Intensive Research. Internet2 Global Summit 2014

How Five International Networks are Enabling International Data-Intensive Research. Internet2 Global Summit 2014 How Five International Networks are Enabling International Data-Intensive Research Internet2 Global Summit 2014 CONTENTS Brief introduction to EYR and EYR-Global Introduction to 2 selected projects Large

More information

Data Analytics with HPC. Data Streaming

Data Analytics with HPC. Data Streaming Data Analytics with HPC Data Streaming Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us

More information

Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France

Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France ERF, Big data & Open data Brussels, 7-8 May 2014 EU-T0, Data

More information

Building a Dutch National Research Infrastructure IRODS UGM 2017

Building a Dutch National Research Infrastructure IRODS UGM 2017 Building a Dutch National Research Infrastructure IRODS UGM 2017 Frank Heere 15-06-2017 SURF: who are we? Not-for-profit cooperative for ICT in Dutch education and research Knowledge sharing Shared digital

More information

Dell EMC All-Flash solutions are powered by Intel Xeon processors. Learn more at DellEMC.com/All-Flash

Dell EMC All-Flash solutions are powered by Intel Xeon processors. Learn more at DellEMC.com/All-Flash N O I T A M R O F S N A R T T I L H E S FU FLA A IN Dell EMC All-Flash solutions are powered by Intel Xeon processors. MODERNIZE WITHOUT COMPROMISE I n today s lightning-fast digital world, your IT Transformation

More information

Inge Van Nieuwerburgh OpenAIRE NOAD Belgium. Tools&Services. OpenAIRE EUDAT. can be reused under the CC BY license

Inge Van Nieuwerburgh OpenAIRE NOAD Belgium. Tools&Services. OpenAIRE EUDAT. can be reused under the CC BY license Inge Van Nieuwerburgh OpenAIRE NOAD Belgium Tools&Services OpenAIRE EUDAT can be reused under the CC BY license Open Access Infrastructure for Research in Europe www.openaire.eu Research Data Services,

More information

The Digitising European Industry strategy & H2020 calls related to Cyber-Physical Systems

The Digitising European Industry strategy & H2020 calls related to Cyber-Physical Systems The Digitising European Industry strategy & H2020 calls related to Cyber-Physical Systems #DigitiseEU Dr. Werner Steinhögl European Commission - DG CONNECT Technologies and Systems for Digitising Industry

More information

N. Marusov, I. Semenov

N. Marusov, I. Semenov GRID TECHNOLOGY FOR CONTROLLED FUSION: CONCEPTION OF THE UNIFIED CYBERSPACE AND ITER DATA MANAGEMENT N. Marusov, I. Semenov Project Center ITER (ITER Russian Domestic Agency N.Marusov@ITERRF.RU) Challenges

More information

2014 年 3 月 13 日星期四. From Big Data to Big Value Infrastructure Needs and Huawei Best Practice

2014 年 3 月 13 日星期四. From Big Data to Big Value Infrastructure Needs and Huawei Best Practice 2014 年 3 月 13 日星期四 From Big Data to Big Value Infrastructure Needs and Huawei Best Practice Data-driven insight Making better, more informed decisions, faster Raw Data Capture Store Process Insight 1 Data

More information

Conference The Data Challenges of the LHC. Reda Tafirout, TRIUMF

Conference The Data Challenges of the LHC. Reda Tafirout, TRIUMF Conference 2017 The Data Challenges of the LHC Reda Tafirout, TRIUMF Outline LHC Science goals, tools and data Worldwide LHC Computing Grid Collaboration & Scale Key challenges Networking ATLAS experiment

More information

BUSINESS DATA LAKE FADI FAKHOURI, SR. SYSTEMS ENGINEER, ISILON SPECIALIST. Copyright 2016 EMC Corporation. All rights reserved.

BUSINESS DATA LAKE FADI FAKHOURI, SR. SYSTEMS ENGINEER, ISILON SPECIALIST. Copyright 2016 EMC Corporation. All rights reserved. BUSINESS DATA LAKE FADI FAKHOURI, SR. SYSTEMS ENGINEER, ISILON SPECIALIST 1 UNSTRUCTURED DATA GROWTH 75% 78% 80% 2015 71 EB 2016 106 EB 2017 133 EB Total Capacity Shipped, Worldwide % of Unstructured Data

More information

Power of the Portfolio. Copyright 2012 EMC Corporation. All rights reserved.

Power of the Portfolio. Copyright 2012 EMC Corporation. All rights reserved. Power of the Portfolio 1 VMAX / VPLEX K-12 School System District seeking system to support rollout of new VDI implementation Customer found Vblock to be superior solutions versus competitor Customer expanded

More information

ODC and future EIDA/ EPOS-S plans within EUDAT2020. Luca Trani and the EIDA Team Acknowledgements to SURFsara and the B2SAFE team

ODC and future EIDA/ EPOS-S plans within EUDAT2020. Luca Trani and the EIDA Team Acknowledgements to SURFsara and the B2SAFE team B2Safe @ ODC and future EIDA/ EPOS-S plans within EUDAT2020 Luca Trani and the EIDA Team Acknowledgements to SURFsara and the B2SAFE team 3rd Conference, Amsterdam, The Netherlands, 24-25 September 2014

More information

Atos announces the Bull sequana X1000 the first exascale-class supercomputer. Jakub Venc

Atos announces the Bull sequana X1000 the first exascale-class supercomputer. Jakub Venc Atos announces the Bull sequana X1000 the first exascale-class supercomputer Jakub Venc The world is changing The world is changing Digital simulation will be the key contributor to overcome 21 st century

More information

Adding Cloud Based Interactive Compute Capabilities to Globus Endpoints

Adding Cloud Based Interactive Compute Capabilities to Globus Endpoints Adding Cloud Based Interactive Compute Capabilities to Globus Endpoints Ben Galewsky Research Programmer, National Center for Supercomputing Applications bengal1@illinois.edu http://www.nationaldataservice.org/

More information

IM&T Data Management Strategy Overview March 2013 CSIRO INFORMATION MANAGEMENT & TECHNOLOGY

IM&T Data Management Strategy Overview March 2013 CSIRO INFORMATION MANAGEMENT & TECHNOLOGY IM&T Data Management Strategy Overview March 2013 CSIRO INFORMATION MANAGEMENT & TECHNOLOGY What is the CSIRO Research Data Service? RDS is a newly established service developed by IM&T that delivers a

More information

Xcellis Technical Overview: A deep dive into the latest hardware designed for StorNext 5

Xcellis Technical Overview: A deep dive into the latest hardware designed for StorNext 5 TECHNOLOGY BRIEF Xcellis Technical Overview: A deep dive into the latest hardware designed for StorNext 5 ABSTRACT Xcellis represents the culmination of over 15 years of file system and data management

More information

MODERNISE WITH ALL-FLASH. Intel Inside. Powerful Data Centre Outside.

MODERNISE WITH ALL-FLASH. Intel Inside. Powerful Data Centre Outside. MODERNISE WITH ALL-FLASH Intel Inside. Powerful Data Centre Outside. MODERNISE WITHOUT COMPROMISE In today s lightning-fast digital world, it s critical for businesses to make their move to the Modern

More information

Transforming IT: From Silos To Services

Transforming IT: From Silos To Services Transforming IT: From Silos To Services Chuck Hollis Global Marketing CTO EMC Corporation http://chucksblog.emc.com @chuckhollis IT is being transformed. Our world is changing fast New Technologies New

More information

European Cloud Initiative: implementation status. Augusto BURGUEÑO ARJONA European Commission DG CNECT Unit C1: e-infrastructure and Science Cloud

European Cloud Initiative: implementation status. Augusto BURGUEÑO ARJONA European Commission DG CNECT Unit C1: e-infrastructure and Science Cloud European Cloud Initiative: implementation status Augusto BURGUEÑO ARJONA European Commission DG CNECT Unit C1: e-infrastructure and Science Cloud Political drivers for action EC Communication "European

More information

Inauguration Cartesius June 14, 2013

Inauguration Cartesius June 14, 2013 Inauguration Cartesius June 14, 2013 Hardware is Easy...but what about software/applications/implementation/? Dr. Peter Michielse Deputy Director 1 Agenda History Cartesius Hardware path to exascale: the

More information

REFERENCE ARCHITECTURE Quantum StorNext and Cloudian HyperStore

REFERENCE ARCHITECTURE Quantum StorNext and Cloudian HyperStore REFERENCE ARCHITECTURE Quantum StorNext and Cloudian HyperStore CLOUDIAN + QUANTUM REFERENCE ARCHITECTURE 1 Table of Contents Introduction to Quantum StorNext 3 Introduction to Cloudian HyperStore 3 Audience

More information

RESEARCH DATA DEPOT AT PURDUE UNIVERSITY

RESEARCH DATA DEPOT AT PURDUE UNIVERSITY Preston Smith Director of Research Services RESEARCH DATA DEPOT AT PURDUE UNIVERSITY May 18, 2016 HTCONDOR WEEK 2016 Ran into Miron at a workshop recently.. Talked about data and the challenges of providing

More information

Isilon: Raising The Bar On Performance & Archive Use Cases. John Har Solutions Product Manager Unstructured Data Storage Team

Isilon: Raising The Bar On Performance & Archive Use Cases. John Har Solutions Product Manager Unstructured Data Storage Team Isilon: Raising The Bar On Performance & Archive Use Cases John Har Solutions Product Manager Unstructured Data Storage Team What we ll cover in this session Isilon Overview Streaming workflows High ops/s

More information

irods usage at CC-IN2P3: a long history

irods usage at CC-IN2P3: a long history Centre de Calcul de l Institut National de Physique Nucléaire et de Physique des Particules irods usage at CC-IN2P3: a long history Jean-Yves Nief Yonny Cardenas Pascal Calvat What is CC-IN2P3? IN2P3:

More information

A Breakthrough in Non-Volatile Memory Technology FUJITSU LIMITED

A Breakthrough in Non-Volatile Memory Technology FUJITSU LIMITED A Breakthrough in Non-Volatile Memory Technology & 0 2018 FUJITSU LIMITED IT needs to accelerate time-to-market Situation: End users and applications need instant access to data to progress faster and

More information

Modern Data Warehouse The New Approach to Azure BI

Modern Data Warehouse The New Approach to Azure BI Modern Data Warehouse The New Approach to Azure BI History On-Premise SQL Server Big Data Solutions Technical Barriers Modern Analytics Platform On-Premise SQL Server Big Data Solutions Modern Analytics

More information

Challenges of Big Data Movement in support of the ESA Copernicus program and global research collaborations

Challenges of Big Data Movement in support of the ESA Copernicus program and global research collaborations APAN Cloud WG Challenges of Big Data Movement in support of the ESA Copernicus program and global research collaborations Lift off NCI and Copernicus The National Computational Infrastructure (NCI) in

More information

DDN Annual High Performance Computing Trends Survey Reveals Rising Deployment of Flash Tiers & Private/Hybrid Clouds vs.

DDN Annual High Performance Computing Trends Survey Reveals Rising Deployment of Flash Tiers & Private/Hybrid Clouds vs. DDN Annual High Performance Computing Trends Survey Reveals Rising Deployment of Flash Tiers & Private/Hybrid Clouds vs. Public for HPC HPC End Users Cite Mixed I/O as the Most Difficult Performance Challenge

More information

AUTOMATE THE DEPLOYMENT OF SECURE DEVELOPER VPCs

AUTOMATE THE DEPLOYMENT OF SECURE DEVELOPER VPCs AUTOMATE THE DEPLOYMENT OF SECURE DEVELOPER VPCs WITH PALO ALTO NETWORKS AND REAN CLOUD 1 INTRODUCTION EXECUTIVE SUMMARY Organizations looking to provide developers with a free-range development environment

More information

Users and utilization of CERIT-SC infrastructure

Users and utilization of CERIT-SC infrastructure Users and utilization of CERIT-SC infrastructure Equipment CERIT-SC is an integral part of the national e-infrastructure operated by CESNET, and it leverages many of its services (e.g. management of user

Predicting Service Outage Using Machine Learning Techniques. HPE Innovation Center

Predicting Service Outage Using Machine Learning Techniques. HPE Innovation Center Predicting Service Outage Using Machine Learning Techniques HPE Innovation Center HPE Innovation Center - Our AI Expertise Sense Learn Comprehend Act Computer Vision Machine Learning Natural Language Processing

2017 Resource Allocations Competition Results

2017 Resource Allocations Competition Results 2017 Resource Allocations Competition Results Table of Contents Executive Summary...3 Computational Resources...5 CPU Allocations...5 GPU Allocations...6 Cloud Allocations...6 Storage Resources...6 Acceptance

Apache Hadoop 3. Balazs Gaspar Sales Engineer CEE & CIS Cloudera, Inc. All rights reserved.

Apache Hadoop 3. Balazs Gaspar Sales Engineer CEE & CIS Cloudera, Inc. All rights reserved. Apache Hadoop 3 Balazs Gaspar Sales Engineer CEE & CIS balazs@cloudera.com 1 We believe data can make what is impossible today, possible tomorrow 2 We empower people to transform complex data into clear

Integrate MATLAB Analytics into Enterprise Applications

Integrate MATLAB Analytics into Enterprise Applications Integrate Analytics into Enterprise Applications Dr. Roland Michaely 2015 The MathWorks, Inc. 1 Data Analytics Workflow Access and Explore Data Preprocess Data Develop Predictive Models Integrate Analytics

EMC ISILON HARDWARE PLATFORM

EMC ISILON HARDWARE PLATFORM EMC ISILON HARDWARE PLATFORM Three flexible product lines that can be combined in a single file system tailored to specific business needs. S-SERIES Purpose-built for highly transactional & IOPS-intensive

SGI Overview. HPC User Forum Dearborn, Michigan September 17th, 2012

SGI Overview. HPC User Forum Dearborn, Michigan September 17th, 2012 SGI Overview HPC User Forum Dearborn, Michigan September 17th, 2012 SGI Market Strategy HPC Commercial Scientific Modeling & Simulation Big Data Hadoop In-memory Analytics Archive Cloud Public Private

SILECS Super Infrastructure for Large-scale Experimental Computer Science

SILECS Super Infrastructure for Large-scale Experimental Computer Science Super Infrastructure for Large-scale Experimental Computer Science Serge Fdida (UPMC) Frédéric Desprez (Inria) Christian Perez (Inria) INRIA, CNRS, RENATER, CEA, CPU, CDEFI, IMT, Sorbonne Universite, Universite

Scientific data processing at global scale The LHC Computing Grid. fabio hernandez

Scientific data processing at global scale The LHC Computing Grid. fabio hernandez Scientific data processing at global scale The LHC Computing Grid Chengdu (China), July 5th 2011 Who I am 2 Computing science background Working in the field of computing for high-energy physics since

Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda

Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda 1 Motivation And Intro Programming Model Spark Data Transformation Model Construction Model Training Model Inference Execution Model Data Parallel Training

Emerging Technologies for HPC Storage

Emerging Technologies for HPC Storage Emerging Technologies for HPC Storage Dr. Wolfgang Mertz CTO EMEA Unstructured Data Solutions June 2018 The very definition of HPC is expanding Blazing Fast Speed Accessibility and flexibility 2 Traditional

MOHA: Many-Task Computing Framework on Hadoop

MOHA: Many-Task Computing Framework on Hadoop Apache: Big Data North America 2017 @ Miami MOHA: Many-Task Computing Framework on Hadoop Soonwook Hwang Korea Institute of Science and Technology Information May 18, 2017 Table of Contents Introduction

DDN. DDN Updates. DataDirect Neworks Japan, Inc Nobu Hashizume. DDN Storage 2018 DDN Storage 1

DDN. DDN Updates. DataDirect Neworks Japan, Inc Nobu Hashizume. DDN Storage 2018 DDN Storage 1 1 DDN DDN Updates DataDirect Neworks Japan, Inc Nobu Hashizume DDN Storage 2018 DDN Storage 1 2 DDN A Broad Range of Technologies to Best Address Your Needs Your Use Cases Research Big Data Enterprise

HPC & Quantum Technologies in Europe

HPC & Quantum Technologies in Europe 64th HPC User Forum HPC & Quantum Technologies in Europe Dr Gustav Kalbe Head of Unit High Performance Computing & Quantum Technologies DG CONNECT, European Commission European HPC Strategy & Support

NEW CONVERGED APPROACH FOR SAP POWERED BY ATOS

NEW CONVERGED APPROACH FOR SAP POWERED BY ATOS NEW CONVERGED APPROACH FOR SAP POWERED BY ATOS Michael Schmitter, Atos Tim Wörfel, Hitachi Vantara 28.02.2018 HITACHI and Atos Partnership More 9 Years Partnership Partnership covers main areas of the

Striving for efficiency

Striving for efficiency Ron Dekker Director CESSDA Striving for efficiency Realise the social data part of EOSC How to Get the Maximum from Research Data Prerequisites and Outcomes University of Tartu, 29 May 2018 Trends 1.Growing

OUR VISION To be a global leader of computing research in identified areas that will bring positive impact to the lives of citizens and society.

OUR VISION To be a global leader of computing research in identified areas that will bring positive impact to the lives of citizens and society. Join the Innovation Qatar Computing Research Institute (QCRI) is a national research institute established in 2010 by Qatar Foundation for Education, Science and Community Development. As a primary constituent

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E Powering Knowledge Discovery Insights from big data with Linguamatics I2E Gain actionable insights from unstructured data The world now generates an overwhelming amount of data, most of it written in natural

Demystifying the Cloud With a Look at Hybrid Hosting and OpenStack

Demystifying the Cloud With a Look at Hybrid Hosting and OpenStack Demystifying the Cloud With a Look at Hybrid Hosting and OpenStack Robert Collazo Systems Engineer Rackspace Hosting The Rackspace Vision Agenda Truly a New Era of Computing 70 s 80 s Mainframe Era 90

Accelerating Digital Transformation with InterSystems IRIS and vsan

Accelerating Digital Transformation with InterSystems IRIS and vsan HCI2501BU Accelerating Digital Transformation with InterSystems IRIS and vsan Murray Oldfield, InterSystems Andreas Dieckow, InterSystems Christian Rauber, VMware #vmworld #HCI2501BU Disclaimer This presentation

DDN. DDN Updates. Data DirectNeworks Japan, Inc Shuichi Ihara. DDN Storage 2017 DDN Storage

DDN. DDN Updates. Data DirectNeworks Japan, Inc Shuichi Ihara. DDN Storage 2017 DDN Storage DDN DDN Updates Data DirectNeworks Japan, Inc Shuichi Ihara DDN A Broad Range of Technologies to Best Address Your Needs Protection Security Data Distribution and Lifecycle Management Open Monitoring Your

HETEROGENEOUS COMPUTE INFRASTRUCTURE FOR SINGAPORE

HETEROGENEOUS COMPUTE INFRASTRUCTURE FOR SINGAPORE HETEROGENEOUS COMPUTE INFRASTRUCTURE FOR SINGAPORE PHILIP HEAH ASSISTANT CHIEF EXECUTIVE TECHNOLOGY & INFRASTRUCTURE GROUP LAUNCH OF SERVICES AND DIGITAL ECONOMY (SDE) TECHNOLOGY ROADMAP (NOV 2018) Source

Travelling securely on the Grid to the origin of the Universe

Travelling securely on the Grid to the origin of the Universe 1 Travelling securely on the Grid to the origin of the Universe F-Secure SPECIES 2007 conference Wolfgang von Rüden 1 Head, IT Department, CERN, Geneva 24 January 2007 2 CERN stands for over 50 years of

TECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING

TECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING TECHNICAL OVERVIEW ACCELERATED COMPUTING AND THE DEMOCRATIZATION OF SUPERCOMPUTING Table of Contents: The Accelerated Data Center Optimizing Data Center Productivity Same Throughput with Fewer Server Nodes

Preparing for High-Luminosity LHC. Bob Jones CERN Bob.Jones <at> cern.ch

Preparing for High-Luminosity LHC. Bob Jones CERN Bob.Jones <at> cern.ch Preparing for High-Luminosity LHC Bob Jones CERN Bob.Jones cern.ch The Mission of CERN Push back the frontiers of knowledge E.g. the secrets of the Big Bang what was the matter like within the first

Fast Hardware For AI

Fast Hardware For AI Fast Hardware For AI Karl Freund karl@moorinsightsstrategy.com Sr. Analyst, AI and HPC Moor Insights & Strategy Follow my blogs covering Machine Learning Hardware on Forbes: http://www.forbes.com/sites/moorinsights

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples Hadoop Introduction 1 Topics Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples 2 Big Data Analytics What is Big Data?

Distributing storage of LHC data - in the nordic countries

Distributing storage of LHC data - in the nordic countries Distributing storage of LHC data - in the nordic countries Gerd Behrmann INTEGRATE ASG Lund, May 11th, 2016 Agenda WLCG: A world wide computing grid for the LHC NDGF: The Nordic Tier 1 dcache: Distributed

THE EMC ISILON STORY. Big Data In The Enterprise. Deya Bassiouni Isilon Regional Sales Manager Emerging Africa, Egypt & Lebanon.

THE EMC ISILON STORY. Big Data In The Enterprise. Deya Bassiouni Isilon Regional Sales Manager Emerging Africa, Egypt & Lebanon. THE EMC ISILON STORY Big Data In The Enterprise Deya Bassiouni Isilon Regional Sales Manager Emerging Africa, Egypt & Lebanon August, 2012 1 Big Data In The Enterprise Isilon Overview Isilon Technology

Pervasive DataRush TM

Pervasive DataRush TM Pervasive DataRush TM Parallel Data Analysis with KNIME www.pervasivedatarush.com Company Overview Global Software Company Tens of thousands of users across the globe Americas, EMEA, Asia ~230 employees

Using Cartesius and Lisa. Zheng Meyer-Zhao - Consultant Clustercomputing

Using Cartesius and Lisa. Zheng Meyer-Zhao - Consultant Clustercomputing Zheng Meyer-Zhao - zheng.meyer-zhao@surfsara.nl Consultant Clustercomputing Outline SURFsara About us What we do Cartesius and Lisa Architectures and Specifications File systems Funding Hands-on Logging

High Performance Computing from an EU perspective

High Performance Computing from an EU perspective High Performance Computing from an EU perspective DEISA PRACE Symposium 2010 Barcelona, 10 May 2010 Kostas Glinos European Commission - DG INFSO Head of Unit GÉANT & e-infrastructures 1 "The views expressed

Richard Curran :Security Officer EMEA. Mario Romao : Senior Manager Policy, Intel

Richard Curran :Security Officer EMEA. Mario Romao : Senior Manager Policy, Intel Richard Curran :Security Officer EMEA Mario Romao : Senior Manager Policy, Intel Digital Convergence Across All Industries Traditional Economy Digital Convergence Blending of Traditional and Digital Business

HPC IN EUROPE. Organisation of public HPC resources

HPC IN EUROPE. Organisation of public HPC resources HPC IN EUROPE Organisation of public HPC resources Context Focus on publicly-funded HPC resources provided primarily to enable scientific research and development at European universities and other publicly-funded

NorStore. a national infrastructure for scientific data. Andreas O Jaunsen UNINETT Sigma as

NorStore. a national infrastructure for scientific data. Andreas O Jaunsen UNINETT Sigma as NorStore a national infrastructure for scientific data Andreas O Jaunsen UNINETT Sigma as About UNINETT Sigma UNINETT Sigma AS is a private company established by the Ministry of science and education

Private Cloud at IIT Delhi

Private Cloud at IIT Delhi Private Cloud at IIT Delhi Success Story Engagement: Long Term Industry: Education Offering: Private Cloud Deployment Business Challenge IIT Delhi, one of the India's leading educational Institute wanted

A Big Big Data Platform

A Big Big Data Platform A Big Big Data Platform John Urbanic, Parallel Computing Scientist 2017 Pittsburgh Supercomputing Center The Shift to Big Data New Emphases Pan-STARRS telescope http://pan-starrs.ifa.hawaii.edu/public/

Fault Detection using Advanced Analytics at CERN's Large Hadron Collider

Fault Detection using Advanced Analytics at CERN's Large Hadron Collider Fault Detection using Advanced Analytics at CERN's Large Hadron Collider Antonio Romero Marín Manuel Martin Marquez USA - 27/01/2016 BIWA 16 1 What's CERN USA - 27/01/2016 BIWA 16 2 What's CERN European

Building Bridges: A System for New HPC Communities

Building Bridges: A System for New HPC Communities Building Bridges: A System for New HPC Communities HPC User Forum 59 LRZ, Garching October 16, 2015 Presenter: Jim Kasdorf Director, Special Projects Pittsburgh Supercomputing Center kasdorf@psc.edu 2015

New Zealand Government IBM Infrastructure as a service

New Zealand Government IBM Infrastructure as a service New Zealand Government IBM Infrastructure as a service Global leverage / local experts World-class Scalable Agile Flexible Fast Secure What are we offering? IBM New Zealand Government Infrastructure as

From raw data to new fundamental particles: The data management lifecycle at the Large Hadron Collider

From raw data to new fundamental particles: The data management lifecycle at the Large Hadron Collider From raw data to new fundamental particles: The data management lifecycle at the Large Hadron Collider Andrew Washbrook School of Physics and Astronomy University of Edinburgh Dealing with Data Conference

Creating a Recommender System. An Elasticsearch & Apache Spark approach

Creating a Recommender System. An Elasticsearch & Apache Spark approach Creating a Recommender System An Elasticsearch & Apache Spark approach My Profile SKILLS Álvaro Santos Andrés Big Data & Analytics Solution Architect in Ericsson with more than 12 years of experience focused

High Performance Computing Course Notes Grid Computing I

High Performance Computing Course Notes Grid Computing I High Performance Computing Course Notes 2008-2009 Grid Computing I Resource Demands Even as computer power, data storage, and communication continue to improve exponentially, resource capacities are

THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel

THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel National Center for Supercomputing Applications University of Illinois

Mainframe Backup Modernization Disk Library for mainframe

Mainframe Backup Modernization Disk Library for mainframe Mainframe Backup Modernization Disk Library for mainframe Mainframe is more important than ever itunes Downloads Instagram Photos Twitter Tweets Facebook Likes YouTube Views Google Searches CICS Transactions

Flexible HPC for Bio-informatics. Peter Clapham

Flexible HPC for Bio-informatics. Peter Clapham Flexible HPC for Bio-informatics Peter Clapham Overview Overview of the Sanger Institute How our data flow works today New scientific demands Private cloud deployment Transitional and future challenges

Activator Library. Focus on maximizing the value of your data, gain business insights, increase your team's productivity, and achieve success.

Activator Library. Focus on maximizing the value of your data, gain business insights, increase your team's productivity, and achieve success. Focus on maximizing the value of your data, gain business insights, increase your team's productivity, and achieve success. ACTIVATORS Designed to give your team assistance when you need it most without

Secure, scalable storage made simple. OEM Storage Portfolio

Secure, scalable storage made simple. OEM Storage Portfolio Secure, scalable storage made simple. OEM Storage Portfolio P Data is the currency of the digital economy. It's the new oil and the lifeblood of your organization. But, how to manage it all? How can you

AI for HPC and HPC for AI Workflows: The Differences, Gaps and Opportunities with Data Management

AI for HPC and HPC for AI Workflows: The Differences, Gaps and Opportunities with Data Management AI for HPC and HPC for AI Workflows: The Differences, Gaps and Opportunities with Data Management @SC Asia 2018 Rangan Sukumar, PhD Office of the CTO, Cray Inc. Safe Harbor Statement This presentation

LEADERS IN DATA SCIENCE

LEADERS IN DATA SCIENCE UNDERSTANDING DATA The ability to extract valuable and actionable information from data is a key factor in building competitive advantage in today's economy. However, to analyse data effectively and efficiently

Science 2.0 VU Big Science, e-science and E-Infrastructures + Bibliometric Network Analysis

Science 2.0 VU Big Science, e-science and E-Infrastructures + Bibliometric Network Analysis WISSEN TECHNIK LEIDENSCHAFT Science 2.0 VU Big Science, e-science and E-Infrastructures + Bibliometric Network Analysis Elisabeth Lex KTI, TU Graz WS 2015/16 www.tugraz.at

ALICE Grid Activities in US

ALICE Grid Activities in US ALICE Grid Activities in US 1 ALICE-USA Computing Project ALICE-USA Collaboration formed to focus on the ALICE EMCal project Construction, installation, testing and integration participating institutions

The EuroHPC strategic initiative

The EuroHPC strategic initiative Amsterdam, 12 December 2017 The EuroHPC strategic initiative Thomas Skordas Director, DG CONNECT-C, European Commission The European HPC strategy in Horizon 2020 Infrastructure Capacity of acquiring leadership-class

by Cisco Intercloud Fabric and the Cisco

by Cisco Intercloud Fabric and the Cisco Expand Your Data Search and Analysis Capability Across a Hybrid Cloud Solution Brief June 2015 Highlights Extend Your Data Center and Cloud Build a hybrid cloud from your IT resources and public and provider-hosted
