KerData: Scalable Data Management on Clouds and Beyond


1 KerData: Scalable Data Management on Clouds and Beyond
Gabriel Antoniu, INRIA Rennes Bretagne Atlantique Research Centre
Franco-British Workshop on Big Data in Science, 6-7 November 2012

2 The French Institute for Research in ICST (Information and Communication Science and Technologies)
- Research
- Technology development and experimentation
- Education and training
- Transfer and innovation
A scientific and technological public institution under the dual authority of the Ministry of Research and the Ministry of Industry

3 Inria Key Figures
- 4,290 people (60% paid by INRIA)
- 8 research centers in France
- 41 international conferences
- 69 associated teams throughout the world
- 4,850 scientific publications
- 3,430 scientists
- 1,000 doctoral students
- 100 post-doctoral
- 300 R&D engineers
- 171 project teams
- A budget of 252M, of which more than 25% from external resources
- 271 active patents (in total)
- 1,000 software

4 Five Main Research Topics
1. Applied Mathematics, Computation and Simulation
2. Algorithmics, Programming, Software and Architecture
3. Networks, Systems and Services, Distributed Computing
4. Perception, Cognition, Interaction
5. Computational Sciences for Biology, Medicine and Environment

5 Inria's Research Centres
- Inria LILLE Nord Europe
- Inria PARIS - Rocquencourt
- Inria NANCY Grand Est
- Inria SACLAY Île-de-France
- Inria RENNES Bretagne Atlantique
- Inria GRENOBLE Rhône-Alpes
- Inria BORDEAUX Sud-Ouest
- Inria SOPHIA ANTIPOLIS Méditerranée

6 Inria Project-Team
- 10 to 30 people working under a scientific leader's supervision
- Focused scientific theme and international evaluation
- A limited lifespan: average 8 years, maximum 12 years
- Well-defined objectives and work program
- Linked to and cooperating with industrial and scientific partners in France and around the world
- A priori and a posteriori evaluation
- 171 Inria Project-Teams in in partnership
An organization that complements the universities

7 Context for KerData: the Data Deluge
- Deliver the capability to mine, search and analyze this data in near real time
- Science itself is evolving
Credits: Microsoft

8 The Data Science: The 4th Paradigm for Scientific Discovery
- Thousand years ago: description of natural phenomena
- Last few hundred years: Newton's laws, Maxwell's equations
- Last few decades: simulation of complex phenomena
- Today and the future: unify theory, experiment and simulation with large multidisciplinary data, using data exploration and data mining (from instruments, sensors, humans, ...)
Equation shown on the slide: $\left(\frac{\dot{a}}{a}\right)^2 = \frac{4\pi G\rho}{3} - K\frac{c^2}{a^2}$
Distributed communities
Credits: Dennis Gannon

10 KerData's Focus: How to efficiently store and share data for new-generation, data-intensive applications?
Scientific challenges
- Massive data (1 object = 1 TB)
- Geographically distributed
- Fine-grain access (MB) for reading and writing
- High concurrency (10³ concurrent clients)
- Without locking
Major goal: high throughput under heavy concurrency
Our contribution
- Design and implementation of distributed algorithms
- Validation with real apps on real platforms with real users

11 KerData's Core Research Challenges
Applications
- Massive data analysis: clouds (e.g. MapReduce)
- Post-Petascale HPC simulations: supercomputers
Focus 1: Scalable data storage on IaaS and PaaS clouds
- Challenge: understand how to reconcile performance, scalability, security and quality of service according to the requirements of data-intensive applications
- Our answer: BlobSeer
Focus 2: Scalable data storage on Post-Petascale HPC systems
- Challenge: go beyond the limitations of current file-based approaches
- Our answer: Damaris
Key aspect: multi-disciplinary collaborative research with application communities

12 Focus 1: Scalable Data Storage on IaaS and PaaS Clouds

13 BlobSeer: A Software Platform for Scalable, Distributed BLOB Management
Started in 2008, 6 PhD theses (Gilles Kahn/SPECIF PhD Thesis Award in 2011)
Main goal: optimized for data accesses under heavy concurrency
Three key ideas
- Decentralized metadata management
- Lock-free concurrent writes (enabled by versioning): write = create a new version of the data
- Data and metadata patching rather than updating
A back-end for higher-level data management systems
- Short term: highly scalable distributed file systems
- Middle term: storage for cloud services
Our approach
- Design and implementation of distributed algorithms
- Experiments on the Grid'5000 grid/cloud testbed
- Validation with real apps on real platforms: Nimbus, Azure, OpenNebula clouds
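The versioning and patching ideas on this slide can be illustrated with a toy example. The sketch below is not BlobSeer's code or API: the class `VersionedBlob`, the `CHUNK` size and the in-memory lists are all hypothetical, and the example is single-process, so it shows only why a write that creates a new version never disturbs readers of older versions, not the distributed lock-free protocol itself.

```python
from dataclasses import dataclass, field

CHUNK = 4  # chunk size in bytes; tiny on purpose, for illustration only


@dataclass
class VersionedBlob:
    """Toy in-memory blob: every write publishes a new immutable version."""
    chunks: list = field(default_factory=list)            # append-only chunk store
    versions: list = field(default_factory=lambda: [()])  # version 0 is the empty blob

    def write(self, offset: int, data: bytes) -> int:
        """Write chunk-aligned data at offset and return the new version number."""
        if offset % CHUNK or len(data) % CHUNK:
            raise ValueError("offset and length must be multiples of CHUNK")
        layout = list(self.versions[-1])  # copies chunk ids only, never the data
        first = offset // CHUNK
        count = len(data) // CHUNK
        # Writing past the end leaves holes (None) that read back as zeros.
        layout.extend([None] * (first + count - len(layout)))
        for i in range(count):
            self.chunks.append(data[i * CHUNK:(i + 1) * CHUNK])
            layout[first + i] = len(self.chunks) - 1  # patch: point at the new chunk
        self.versions.append(tuple(layout))  # older versions stay untouched
        return len(self.versions) - 1

    def read(self, version: int) -> bytes:
        """Return the full content of the given version."""
        return b"".join(
            b"\0" * CHUNK if cid is None else self.chunks[cid]
            for cid in self.versions[version]
        )


blob = VersionedBlob()
v1 = blob.write(0, b"AAAABBBB")
v2 = blob.write(4, b"CCCC")   # overwrites the second chunk in a new version
print(blob.read(v1))          # the old version is still readable
print(blob.read(v2))
```

In this sketch the two versions share the unmodified first chunk, which is the point of patching metadata instead of updating data in place.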

14 Impact of BlobSeer: MapReduce
BlobSeer improves Hadoop
- Gain (execution time): 35%
ANR MapReduce Project ( )
- Lead: G. Antoniu (KerData)
- Partners: INRIA (AVALON), Argonne National Lab, U. Illinois Urbana-Champaign, IBM, JLPC, IBCP, MEDIT
Strong collaboration with the Nimbus team from Argonne National Lab
- BlobSeer integrated with the Nimbus cloud toolkit
- BlobSeer used for efficient VM deployment and snapshotting
- Validation: Grid'5000 with Nimbus, FutureGrid (USA), Open Cirrus (USA)

15 Impact of BlobSeer: The A-Brain Microsoft Research - Inria Project
KerData, PARIETAL teams at INRIA
European Microsoft Innovation Center (Aachen)
[Figure: brain image data Y (anatomical MRI, functional MRI, diffusion MRI) related to genetic data X (DNA array (SNP/CNV), gene expression data, others...); p ~ 10^6, N ~ 2000]
TomusBlobs software (based on BlobSeer)
- Gain over Azure Blobs: 45%
- Scalability: 350 cores
Demo available!

16 Scalable Storage on Clouds: Our 4-Year Strategy
Diagram labels: BlobSeer; Nimbus; ANR MapReduce; ADT BlobSeer Clouds; Microsoft Azure; A-Brain; OpenNebula; FP7 SCALUS
- PaaS: advanced services for data storage, sharing and processing
- IaaS: VM deployment and checkpointing services
Challenge: understand how to reconcile performance, scalability, security and quality of service according to the application requirements
- Autonomy, adaptive consistency, dynamic elasticity
- Trade-offs exposed to the user
Our tools: algorithms/distributed protocols, software, distributed platforms
Collaboration with application communities

17 Focus 2: Scalable Data Storage for Post-Petascale HPC

19 Context: Data Management in Post-Petascale HPC Systems
[Figure labels: cores; PetaBytes of data; ~ cores. Callout: Does not scale!]
- Problem: simulations generate TB/min
- How to store and transfer data?
- How to analyze, visualize and extract knowledge?
What is difficult?
- Too many files (e.g. Blue Waters files/min)
- Too much data
- Unpredictable I/O performance

20 Damaris: A Middleware-Level Approach to I/O on Multicore HPC Systems
- Idea: one dedicated I/O core per multicore node
- Originality: shared memory, asynchronous processing
- Implementation: software library
- Application: tornado simulation (Blue Waters)
- Tests on Kraken (11th in Top PF)
Results
- Scales on 10,000+ cores
- 12x fewer files
- Overhead-free compression
- Predictable performance
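The dedicated-I/O-core idea can be sketched in a few lines. This is not Damaris code: the sketch swaps in Python threads and a `queue.Queue` as stand-ins for the compute cores, the dedicated I/O core and the node's shared-memory buffer, and the names `run_simulation`, `outbox` and `written` are hypothetical. It only shows the hand-off pattern, in which compute workers deposit their output and continue while a single writer drains it asynchronously.

```python
import queue
import threading


def run_simulation(steps: int, workers: int) -> list:
    """Run `workers` compute threads that hand their output to one dedicated writer."""
    outbox = queue.Queue()  # stand-in for the node's shared-memory buffer
    written = []            # stand-in for the file system

    def io_worker():
        # Dedicated I/O role: drain the buffer until the None sentinel arrives.
        while True:
            item = outbox.get()
            if item is None:
                break
            written.append(item)  # a real system would write (and compress) here

    def compute(rank: int):
        for step in range(steps):
            result = rank * 1000 + step       # placeholder for real computation
            outbox.put((step, rank, result))  # hand-off only; no file I/O here

    writer = threading.Thread(target=io_worker)
    writer.start()
    computes = [threading.Thread(target=compute, args=(r,)) for r in range(workers)]
    for t in computes:
        t.start()
    for t in computes:
        t.join()
    outbox.put(None)  # all producers are done: tell the writer to stop
    writer.join()
    return sorted(written)


records = run_simulation(steps=3, workers=4)
print(len(records), records[0], records[-1])
```

Because only the writer touches storage in this sketch, the output of all workers on a node can be gathered into one stream, which is one plausible way to read the slide's claims about fewer files and more predictable I/O.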

21 Damaris: Early Impact
- First results of the Joint Lab for Petascale Computing transferred to Blue Waters (2011)
- 2nd Award for Matthieu Dorier at the ACM Student Research Competition (ICS 2011)
"This work is practically very useful (and novel) to the field of computational meteorology and probably fluid dynamics. I think Damaris is going to be the best I/O option for these unprecedented supercomputing simulations."
Leigh Orf, atmospheric scientist, Central Michigan University

22 Scalable Storage and I/O on HPC Systems: Our 4-Year Strategy
Diagram labels: INRIA-ANL (FACCTS, PUF, EA); Damaris, BlobSeer; Blue Waters; JLPC; Post-Petascale HPC; ANR-JST FP3C
- Optimized I/O library for post-petascale HPC
- Middleware for storing multidimensional data
Challenge: go beyond the limitations of current file-based approaches
- Explore dedicated I/O cores for in-situ visualization (collaboration: UIUC, USA)
  - Automatic analysis and image generation
  - Adaptive image generation (recognize "interesting" data subsets)
  - Interactive visualization
- Coupling with alternative approaches (I/O forwarding, etc.) (collaboration: ANL, USA)
- Collaboration with application communities
- Associate Team Data@Exascale with UIUC/ANL submitted ( )

23 Longer-Term Directions
Scalable storage and processing for structured data
- Structural/semantic metadata
- Collaboration with the database community (NoSQL, ...)
Performance modeling for scalable storage systems
- Capture interactions between system components
- Predict storage/transfer costs on clouds, self-optimization
- Collaboration with the scheduling community


More information

Mobile Cloud Computing

Mobile Cloud Computing MTAT.03.262 Mobile Application Development Mobile Cloud Computing Satish Srirama, Huber Flores satish.srirama@ut.ee Tartu, Estonia, 2013 Outline Cloud Computing Mobile Cloud Access schemas Research challenges

More information

Windows Azure Overview

Windows Azure Overview Windows Azure Overview Christine Collet, Genoveva Vargas-Solar Grenoble INP, France MS Azure Educator Grant Packaged Software Infrastructure (as a Service) Platform (as a Service) Software (as a Service)

More information

The Fusion Distributed File System

The Fusion Distributed File System Slide 1 / 44 The Fusion Distributed File System Dongfang Zhao February 2015 Slide 2 / 44 Outline Introduction FusionFS System Architecture Metadata Management Data Movement Implementation Details Unique

More information

CSD3 The Cambridge Service for Data Driven Discovery. A New National HPC Service for Data Intensive science

CSD3 The Cambridge Service for Data Driven Discovery. A New National HPC Service for Data Intensive science CSD3 The Cambridge Service for Data Driven Discovery A New National HPC Service for Data Intensive science Dr Paul Calleja Director of Research Computing University of Cambridge Problem statement Today

More information

Workshop: Innovation Procurement in Horizon 2020 PCP Contractors wanted

Workshop: Innovation Procurement in Horizon 2020 PCP Contractors wanted Workshop: Innovation Procurement in Horizon 2020 PCP Contractors wanted Supercomputing Centre Institute for Advanced Simulation / FZJ 1 www.prace-ri.eu Challenges: Aging Society Energy Food How we can

More information

Insight: that s for NSA Decision making: that s for Google, Facebook. so they find the best way to push out adds and products

Insight: that s for NSA Decision making: that s for Google, Facebook. so they find the best way to push out adds and products What is big data? Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.

More information

EuroHPC and the European HPC Strategy HPC User Forum September 4-6, 2018 Dearborn, Michigan, USA

EuroHPC and the European HPC Strategy HPC User Forum September 4-6, 2018 Dearborn, Michigan, USA EuroHPC and the European HPC Strategy HPC User Forum September 4-6, 2018 Dearborn, Michigan, USA Leonardo Flores Añover Senior Expert - HPC and Quantum technologies DG CONNECT European Commission Overall

More information

Federating FutureGrid and GENI

Federating FutureGrid and GENI Federating FutureGrid and GENI Ilia Baldine (ibaldin@renci.org RENCI/ UNC CH), Jeff Chase (chase@cs.duke.edu, Duke University), Geoffrey Fox (gcfexchange@gmail.com IU), Gregor von Laszewski (laszewski@gmail.com

More information

The FIRST Project and the Future Internet National Research Programme

The FIRST Project and the Future Internet National Research Programme The FIRST Project and the Future Internet National Research Programme Prof. Gyula Sallai DSc Future Internet National Technology Platform (FI NTP) Future Internet Research Coordination Centre (FIRCC) Budapest

More information

Digitising European industry

Digitising European industry Digitising European industry LETI Innovation Days 50 th Anniversary 28 June 2017 Dr. Max Lemke, European Commission, DG CONNECT Head of Unit Technologies & Systems for Digitising Industry #DigitiseEU CEA-LETI

More information

Lightweight Streaming-based Runtime for Cloud Computing. Shrideep Pallickara. Community Grids Lab, Indiana University

Lightweight Streaming-based Runtime for Cloud Computing. Shrideep Pallickara. Community Grids Lab, Indiana University Lightweight Streaming-based Runtime for Cloud Computing granules Shrideep Pallickara Community Grids Lab, Indiana University A unique confluence of factors have driven the need for cloud computing DEMAND

More information

BlobSeer: Next Generation Data Management for Large Scale Infrastructures

BlobSeer: Next Generation Data Management for Large Scale Infrastructures Author manuscript, published in "Journal of Parallel and Distributed Computing 71, 2 (2011) 168-184" DOI : 10.1016/j.jpdc.2010.08.004 BlobSeer: Next Generation Data Management for Large Scale Infrastructures

More information

Co-existence: Can Big Data and Big Computation Co-exist on the Same Systems?

Co-existence: Can Big Data and Big Computation Co-exist on the Same Systems? Co-existence: Can Big Data and Big Computation Co-exist on the Same Systems? Dr. William Kramer National Center for Supercomputing Applications, University of Illinois Where these views come from Large

More information

Big Data, exploiter de grands volumes de données

Big Data, exploiter de grands volumes de données Big Data, exploiter de grands volumes de données mardi 3 juillet 2012 Daniel Teruggi, Head of Research dteruggi@ina.fr Ina: Institut National de l Audiovisuel Institut national de l audiovisuel Missions:

More information

Rutgers Discovery Informatics Institute (RDI2)

Rutgers Discovery Informatics Institute (RDI2) Rutgers Discovery Informatics Institute (RDI2) Manish Parashar h+p://rdi2.rutgers.edu Modern Science & Society Transformed by Compute & Data The era of Extreme Compute and Big Data New paradigms and prac3ces

More information

Database Management Systems

Database Management Systems Database Management Systems Fall 2017 Knowledge is of two kinds: we know a subject ourselves, or we know where we can find information upon it. -- Samuel Johnson (1709-1784) Queries for Today Why? Who?

More information

The EuroHPC strategic initiative

The EuroHPC strategic initiative Amsterdam, 12 December 2017 The EuroHPC strategic initiative Thomas Skordas Director, DG CONNECT-C, European Commission The European HPC strategy in Horizon 2020 Infrastructure Capacity of acquiring leadership-class

More information

Distributed Data Storage in Support for Context-Aware Applications

Distributed Data Storage in Support for Context-Aware Applications Distributed Data Storage in Support for Context-Aware Applications Elena Burceanu, Ciprian Dobre, Valentin Cristea, Alexandru Costan, Gabriel Antoniu University Politehnica of Bucharest Bucharest, Romania

More information

From Multi-Cores to Clouds with ProActive Parallel Suite: UC in Biotech, IT, Finance, and Engineering

From Multi-Cores to Clouds with ProActive Parallel Suite: UC in Biotech, IT, Finance, and Engineering From Multi-Cores to Clouds with ProActive Parallel Suite: UC in Biotech, IT, Finance, and Engineering Agenda 1. Background: INRIA, ActiveEon 2. CLOUD Computing 3. ProActive Parallel Suite Programming,

More information

Modernizing Healthcare IT for the Data-driven Cognitive Era Storage and Software-Defined Infrastructure

Modernizing Healthcare IT for the Data-driven Cognitive Era Storage and Software-Defined Infrastructure Modernizing Healthcare IT for the Data-driven Cognitive Era Storage and Software-Defined Infrastructure An IDC InfoBrief, Sponsored by IBM April 2018 Executive Summary Today s healthcare organizations

More information

PROJECT FINAL REPORT. Tel: Fax:

PROJECT FINAL REPORT. Tel: Fax: PROJECT FINAL REPORT Grant Agreement number: 262023 Project acronym: EURO-BIOIMAGING Project title: Euro- BioImaging - Research infrastructure for imaging technologies in biological and biomedical sciences

More information

High Throughput Data-Compression for Cloud Storage

High Throughput Data-Compression for Cloud Storage High Throughput Data-Compression for Cloud Storage Bogdan Nicolae To cite this version: Bogdan Nicolae. High Throughput Data-Compression for Cloud Storage. Globe 10: Proceedings of the 3rd International

More information

Science-as-a-Service

Science-as-a-Service Science-as-a-Service The iplant Foundation Rion Dooley Edwin Skidmore Dan Stanzione Steve Terry Matthew Vaughn Outline Why, why, why! When duct tape isn t enough Building an API for the web Core services

More information

A Study of High Performance Computing and the Cray SV1 Supercomputer. Michael Sullivan TJHSST Class of 2004

A Study of High Performance Computing and the Cray SV1 Supercomputer. Michael Sullivan TJHSST Class of 2004 A Study of High Performance Computing and the Cray SV1 Supercomputer Michael Sullivan TJHSST Class of 2004 June 2004 0.1 Introduction A supercomputer is a device for turning compute-bound problems into

More information

The CEDA Archive: Data, Services and Infrastructure

The CEDA Archive: Data, Services and Infrastructure The CEDA Archive: Data, Services and Infrastructure Kevin Marsh Centre for Environmental Data Archival (CEDA) www.ceda.ac.uk with thanks to V. Bennett, P. Kershaw, S. Donegan and the rest of the CEDA Team

More information

MOHA: Many-Task Computing Framework on Hadoop

MOHA: Many-Task Computing Framework on Hadoop Apache: Big Data North America 2017 @ Miami MOHA: Many-Task Computing Framework on Hadoop Soonwook Hwang Korea Institute of Science and Technology Information May 18, 2017 Table of Contents Introduction

More information

Information and Communication Technologies (ICT) thematic area

Information and Communication Technologies (ICT) thematic area Information and Communication Technologies (ICT) thematic area Network & Service Infrastructures 1.1 The Network of the Future (Call 4, Call 5) 2009-2010 a) Future Internet Architectures and Network Technologies

More information

High Performance Computing in C and C++

High Performance Computing in C and C++ High Performance Computing in C and C++ Rita Borgo Computer Science Department, Swansea University WELCOME BACK Course Administration Contact Details Dr. Rita Borgo Home page: http://cs.swan.ac.uk/~csrb/

More information

Ian Foster, CS554: Data-Intensive Computing

Ian Foster, CS554: Data-Intensive Computing The advent of computation can be compared, in terms of the breadth and depth of its impact on research and scholarship, to the invention of writing and the development of modern mathematics. Ian Foster,

More information

Collaborations and Partnerships. John Broome CODATA-International

Collaborations and Partnerships. John Broome CODATA-International Collaborations and Partnerships John Broome CODATA-International 1 CODATA-International One of the fundamental roles of CODATA is to facilitate collaboration and partnerships in the management of research

More information

A digital resources warehouse as a playground for the future Internet

A digital resources warehouse as a playground for the future Internet A digital resources warehouse as a playground for the future Internet Eric Fleury, ENS de Lyon / Inria Serge Fdida, UPMC Sorbonne Universités & CNRS Réunion TGIR 31 Mars 2016 Digital transformation Scientific

More information