Exploring Many Task Computing in Scientific Workflows

1 Exploring Many Task Computing in Scientific Workflows
Eduardo Ogasawara, Daniel de Oliveira, Fernando Seabra, Carlos Barbosa, Renato Elias, Vanessa Braganholo, Alvaro Coutinho, Marta Mattoso
Federal University of Rio de Janeiro, Brazil
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
MTAGS '09, November 16th, 2009, Portland, Oregon, USA. Copyright 2009 ACM /09/11... $10.00
MTAGS

2 Agenda
Introduction
  o Scientific experiments
  o Scientific workflows
  o Experiments life cycle
Hydra middleware
Case study
Related work
Conclusion

3 Typical scenario: scientific experiment
1. Data collection
2. Data analyzed by program X
3. Large volume of data produced
4. ...which need to be processed by program Y in a cluster
Results are analyzed by program Z

4 Variations of data or parameters
1. Data collection
2. Data analyzed by program X
3. Large volume of data produced
4. ...which need to be processed by program Y in an MTC environment
Results are analyzed by program Z

5 Current solutions
Scientific Workflow Management Systems (SWfMS)
SWfMS allow the execution of scientific workflows
  o Some SWfMS are strong in workflow design and provenance support (VisTrails, Kepler, Taverna)
  o Some SWfMS are strong in HPC support (Pegasus, Swift, Triana)
Scientists should be free to choose the SWfMS that best suits their needs
This choice should not prevent the adoption of an MTC solution for executing one or more activities of a workflow

6 Parallelization difficulties
Controlling parallel execution in distributed environments
Steering activities in distributed environments
Provenance gathering in distributed/heterogeneous environments

7 Provenance can support analyzing scientific experiments
Before execution:
  o What programs may be used? Is there any alternative to explore?
  o Is there any dependency between activities? Which activities are mandatory?
After execution:
  o What were the parameters that led to the best result?
  o What was the scientific workflow that led to the desired result?
  o Where are the output files generated by the distributed activity A using the parameters P?
  o How many times was activity A in version V used in experiment E?
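The "after execution" questions above map naturally onto queries over a provenance store. The sketch below is only an illustration: the slides do not show Hydra's provenance schema, so the tables `execution` and `output_file`, their columns, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical provenance schema (not taken from the slides):
# one row per activity execution, one row per output file it produced.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE execution (
    id INTEGER PRIMARY KEY,
    experiment TEXT, activity TEXT, version TEXT, params TEXT, node TEXT
);
CREATE TABLE output_file (
    execution_id INTEGER REFERENCES execution(id),
    path TEXT
);
""")

# Made-up sample records, for illustration only.
conn.executemany("INSERT INTO execution VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "E", "A", "V", "p=100", "node01"),
    (2, "E", "A", "V", "p=200", "node02"),
    (3, "E", "B", "V", "p=100", "node01"),
])
conn.executemany("INSERT INTO output_file VALUES (?, ?)", [
    (1, "/scratch/run1/out.dat"),
    (2, "/scratch/run2/out.dat"),
    (3, "/scratch/run3/out.dat"),
])


def count_uses(conn, activity, version, experiment):
    """How many times was `activity` in `version` used in `experiment`?"""
    row = conn.execute(
        "SELECT COUNT(*) FROM execution "
        "WHERE activity = ? AND version = ? AND experiment = ?",
        (activity, version, experiment),
    ).fetchone()
    return row[0]


def output_files(conn, activity, params):
    """Where are the output files generated by `activity` using `params`?"""
    return conn.execute(
        "SELECT e.node, f.path FROM execution e "
        "JOIN output_file f ON f.execution_id = e.id "
        "WHERE e.activity = ? AND e.params = ? ORDER BY f.path",
        (activity, params),
    ).fetchall()
```

Calling `count_uses(conn, "A", "V", "E")` answers the last question on the slide, and `output_files(conn, "A", "p=200")` answers the one before it, returning the node and path of each matching file.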

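8 Our vision of the experiment life cycle
The GExpLine tool supports the experiment life cycle
[Figure: experiment life cycle diagram with the phases Composition (Conception, Reuse), Analysis (Query, Discovery), and Execution (Distribution, Monitoring) around Provenance Data; the labels SWfMS, Hydra, and HPC also appear in the diagram]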

9 Hydra
Middleware solution that bridges the SWfMS to the HPC, supporting MTC parallelization strategies
[Figure: SWfMS, Hydra Middleware, HPC Environment]
Goal: reduce the complexity involved in designing and managing activity/workflow parallel executions while gathering distributed provenance data

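10 Supported parallelization types
[Figure: two panels.
Data Fragmentation: the Input Data is split into fragments I1 ... In; each fragment is processed by an instance of the Activity/Wf with the same Parameters, producing outputs O1 ... On, followed by Data Analysis and Data Output.
Parameter Sweep: the same Data Input is processed by instances of the Activity/Wf with parameter sets Pt1 ... Ptn, producing outputs O1 ... On, followed by Data Analysis and Data Output.]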
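The two parallelization types can be contrasted in a few lines. This is a minimal sketch, not Hydra code: `activity` is a stand-in for a real workflow activity, and the function names and the thread pool are assumptions chosen only to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def activity(data, param):
    """Stand-in for a workflow activity: scale each value by param and sum."""
    return sum(x * param for x in data)


def fragment(data, n):
    """Split data into n contiguous fragments of near-equal size."""
    size, extra = divmod(len(data), n)
    fragments, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        fragments.append(data[start:end])
        start = end
    return fragments


def data_fragmentation(data, param, n):
    """Same parameters, different input fragments I1..In -> outputs O1..On."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda frag: activity(frag, param), fragment(data, n)))


def parameter_sweep(data, params):
    """Same input data, different parameter sets Pt1..Ptn -> outputs O1..On."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda p: activity(data, p), params))
```

With `data = [1, 2, 3, 4, 5]`, `data_fragmentation(data, 2, 2)` runs the activity on `[1, 2, 3]` and `[4, 5]` with the same parameter, while `parameter_sweep(data, [1, 2, 3])` runs it three times on the full input. In both cases the per-task outputs would then feed a data analysis step.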

11 Hydra Architecture
[Figure: Hydra architecture diagram. Recoverable labels:
Client Layer: VisTrails and Swift (SWfMS); Hydra Setup; Hydra Client Components (Uploader, Dispatcher, Gatherer, Downloader)
Hydra MTC Layer: Hydra Setup (Configuration, MUX Workflow); Hydra Preprocessing (Parameter Sweeper, Workspace Handler, Data Fragmenter Cartridge); Hydra Dispatcher / Monitor (Dispatcher, Monitor); Hydra Post-processing (Provenance, Data Analyzer Cartridge)
MTC Environment: Scheduler (PBS, Falkon)
Other labels: Hydra External Components; Storage, Control, Data]

12 Hydra setup
Hydra Setup

13 Hydra client components

14 Hydra pre-processing components
Parameter Sweeper, Workspace Handler, Data Fragmenter Cartridge (Pre-Processing)

15 Hydra dispatcher/monitor components
Dispatcher, Monitor (MTC Processing)

16 Hydra post-processing components
Provenance, Data Analyzer Cartridge (Post-Processing)

17 Hydra Architecture
[Figure: Hydra architecture diagram, repeated with the processing phases marked. Recoverable labels:
Client Layer: VisTrails and Swift (SWfMS); Hydra Client Components (Uploader, Dispatcher, Gatherer, Downloader)
Hydra MTC Layer: Hydra Setup (Configuration, MUX Workflow); Pre-Processing (Parameter Sweeper, Workspace Handler, Data Fragmenter Cartridge); MTC Processing (Dispatcher, Monitor); Post-Processing (Provenance, Data Analyzer Cartridge)
MTC Environment: Scheduler (PBS, Falkon)
Other labels: Storage, Control, Data]

18 Case study
Computational Fluid Dynamics (CFD)
EdgeCFD: a parallel stabilized finite element incompressible flow solver
Synthesized in four steps:
  o Modeling
  o Preprocessing
  o Solution
  o TAU parallel profiling of CFD solver on SGI Altix ICE 8200, 128 cores

19 EdgeCFD experiment life cycle
[Figure: the EdgeCFD workflow (<<Automated>> EdgeCFD Preprocessor; <<Sub-Workflow, Sweep>> EdgeCFD Solver and Control Applications; <<Semi-Automated>> activity) with its files (.case, nn.geo, nn.part.in, nn.part.msh, part.mat, part.ic, part.edg, velo_nnnn.vecnn, press_0000_sdnn, scal_nnnn_sdnn, DD_nnnn_sdnn), shown next to the experiment life cycle diagram (Composition, Conception, Reuse, Analysis, Query, Discovery, Provenance Data, Distribution, Monitoring, Execution) and the label VisTrails & Hydra]

20 Workflow modeled in UML
[Figure: UML workflow with three activities.
<<Automated>> EdgeCFD Preprocessor (Pre-processing)
  Files shown between the preprocessor and the solver: nn.part.in, nn.part.msh, part.mat, part.ic, part.edg
<<Sub-Workflow, Sweep>> EdgeCFD Solver and Control Applications (solver)
  Files shown between the solver and the visualization: .case, nn.geo, velo_nnnn.vecnn, press_0000_sdnn, scal_nnnn_sdnn, DD_nnnn_sdnn
<<Semi-Automated>> visualization]

21 Sequential workflow
[Figure: workflow with three activities in sequence: Pre-processing, solver, visualization]

22 Parameter sweep scenario

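23 Workflow with parameter sweep using Hydra
[Figure: workflow with the activities Pre-processing, solver, visualization, where the solver activity is executed as a parameter sweep through Hydra]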
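The shape of this workflow (pre-processing once, the solver swept over several parameter values, visualization over all results) can be sketched as follows. This is an assumption-laden illustration, not the VisTrails/Hydra setup from the slides: the three functions are hypothetical stand-ins for the real activities, `param` is a generic placeholder for whatever solver parameter is swept, and a local thread pool stands in for the MTC environment.

```python
from concurrent.futures import ThreadPoolExecutor


def preprocess(case_name):
    """Stand-in for the pre-processing activity; runs once, before the sweep."""
    return {"case": case_name, "partitions": 4}


def solve(prepared, param):
    """Stand-in for one solver run with a single parameter value."""
    return {"param": param, "result": prepared["partitions"] * param}


def visualize(results):
    """Stand-in for visualization: collect all solver runs, ordered by parameter."""
    return sorted((r["param"], r["result"]) for r in results)


def run_workflow(case_name, params, max_workers=4):
    """Pre-process once, sweep the solver over params in parallel, then visualize."""
    prepared = preprocess(case_name)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda p: solve(prepared, p), params))
    return visualize(results)
```

Only the middle activity is parallelized here; the first and last activities stay sequential, which matches the idea that one activity of a workflow can be handed to an MTC solution while the rest of the workflow remains in the SWfMS.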

24 Hydra client setup for the solver activity

25 Instrumentation of files for the experiment

26 Hydra provenance

27 Evaluation of a small experiment

28 Related work
Swift/Falkon
  o Provides MTC support from the Swift SWfMS
MyCluster
  o Supports PBS with transient fault support over remote sites
Dryad
  o Supports data parallelization with high scalability
Sawzall
  o A framework for MTC that explores data parallelism

29 Conclusions
The experiment life cycle must be managed as a whole:
  o Composition: the experiment is modeled at a workflow abstraction level until it is deployed into a specific SWfMS
  o Execution: some activities demand HPC with monitoring facilities and provenance gathering
  o Analysis: uses information from both the composition (prospective provenance) and the execution (local and distributed retrospective provenance)
Hydra can be a bridge between the SWfMS and the HPC environment
  o Supports workflow data and parameter sweep parallelization
  o Evaluated on a real CFD solver case with little overhead
  o Supports distributed provenance gathering

30 Future work
Evaluate different kinds of applications (e.g. BLAST, uncertainty quantification)
Model distributed activities that are actually subworkflows
Run experiments in HPC environments with more cores

31 Exploring Many Task Computing in Scientific Workflows
Thank you!
Eduardo Ogasawara, Daniel de Oliveira, Fernando Seabra, Carlos Barbosa, Renato Elias, Vanessa Braganholo, Alvaro Coutinho, Marta Mattoso
Federal University of Rio de Janeiro, Brazil
Please visit our site


More information

Turning Data Science into a reality with TIBCO Spotfire

Turning Data Science into a reality with TIBCO Spotfire Turning Data Science into a reality with TIBCO Spotfire Eduardo Gonzalez-Couto, Ph.D. Product Manager, PerkinElmer Informatics Basel, 3 rd November 2016 Safe Harbor Statement This document shows current

More information

Storage in HPC: Scalable Scientific Data Management. Carlos Maltzahn IEEE Cluster 2011 Storage in HPC Panel 9/29/11

Storage in HPC: Scalable Scientific Data Management. Carlos Maltzahn IEEE Cluster 2011 Storage in HPC Panel 9/29/11 Storage in HPC: Scalable Scientific Data Management Carlos Maltzahn IEEE Cluster 2011 Storage in HPC Panel 9/29/11 Who am I? Systems Research Lab (SRL), UC Santa Cruz LANL/UCSC Institute for Scalable Scientific

More information

ptop: A Process-level Power Profiling Tool

ptop: A Process-level Power Profiling Tool ptop: A Process-level Power Profiling Tool Thanh Do, Suhib Rawshdeh, and Weisong Shi Wayne State University {thanh, suhib, weisong}@wayne.edu ABSTRACT We solve the problem of estimating the amount of energy

More information

WGL A Workflow Generator Language and Utility

WGL A Workflow Generator Language and Utility WGL A Workflow Generator Language and Utility Technical Report Luiz Meyer, Marta Mattoso, Mike Wilde, Ian Foster Introduction Many scientific applications can be characterized as having sets of input and

More information

A Dynamic Memory Management Unit for Embedded Real-Time System-on-a-Chip

A Dynamic Memory Management Unit for Embedded Real-Time System-on-a-Chip A Dynamic Memory Management Unit for Embedded Real-Time System-on-a-Chip Mohamed Shalan Georgia Institute of Technology School of Electrical and Computer Engineering 801 Atlantic Drive Atlanta, GA 30332-0250

More information

Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France

Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France ERF, Big data & Open data Brussels, 7-8 May 2014 EU-T0, Data

More information

Collaborative provenance for workflow-driven science and engineering Altintas, I.

Collaborative provenance for workflow-driven science and engineering Altintas, I. UvA-DARE (Digital Academic Repository) Collaborative provenance for workflow-driven science and engineering Altintas, I. Link to publication Citation for published version (APA): Altıntaş, İ. (2011). Collaborative

More information

Kestrel: An XMPP-Based Framework for Many Task Computing Applications

Kestrel: An XMPP-Based Framework for Many Task Computing Applications Kestrel: An XMPP-Based Framework for Many Task Computing Applications Lance Stout, Michael A. Murphy, and Sebastien Goasguen School of Computing Clemson University Clemson, SC 29634-0974 USA {lstout, mamurph,

More information

Opportunities of the rcuda remote GPU virtualization middleware. Federico Silla Universitat Politècnica de València Spain

Opportunities of the rcuda remote GPU virtualization middleware. Federico Silla Universitat Politècnica de València Spain Opportunities of the rcuda remote virtualization middleware Federico Silla Universitat Politècnica de València Spain st Outline What is rcuda? HPC Advisory Council China Conference 2017 2/45 s are the

More information

An Attempt to Identify Weakest and Strongest Queries

An Attempt to Identify Weakest and Strongest Queries An Attempt to Identify Weakest and Strongest Queries K. L. Kwok Queens College, City University of NY 65-30 Kissena Boulevard Flushing, NY 11367, USA kwok@ir.cs.qc.edu ABSTRACT We explore some term statistics

More information

USING THE BUSINESS PROCESS EXECUTION LANGUAGE FOR MANAGING SCIENTIFIC PROCESSES. Anna Malinova, Snezhana Gocheva-Ilieva

USING THE BUSINESS PROCESS EXECUTION LANGUAGE FOR MANAGING SCIENTIFIC PROCESSES. Anna Malinova, Snezhana Gocheva-Ilieva International Journal "Information Technologies and Knowledge" Vol.2 / 2008 257 USING THE BUSINESS PROCESS EXECUTION LANGUAGE FOR MANAGING SCIENTIFIC PROCESSES Anna Malinova, Snezhana Gocheva-Ilieva Abstract:

More information

Techniques for Efficient Execution of Large-Scale Scientific Workflows in Distributed Environments

Techniques for Efficient Execution of Large-Scale Scientific Workflows in Distributed Environments Florida International University FIU Digital Commons FIU Electronic Theses and Dissertations University Graduate School 11-14-2014 Techniques for Efficient Execution of Large-Scale Scientific Workflows

More information

Headline in Arial Bold 30pt. Visualisation using the Grid Jeff Adie Principal Systems Engineer, SAPK July 2008

Headline in Arial Bold 30pt. Visualisation using the Grid Jeff Adie Principal Systems Engineer, SAPK July 2008 Headline in Arial Bold 30pt Visualisation using the Grid Jeff Adie Principal Systems Engineer, SAPK July 2008 Agenda Visualisation Today User Trends Technology Trends Grid Viz Nodes Software Ecosystem

More information

Users and utilization of CERIT-SC infrastructure

Users and utilization of CERIT-SC infrastructure Users and utilization of CERIT-SC infrastructure Equipment CERIT-SC is an integral part of the national e-infrastructure operated by CESNET, and it leverages many of its services (e.g. management of user

More information

Evolving FIRE into a 5G-Oriented Experimental Playground for Vertical Industries

Evolving FIRE into a 5G-Oriented Experimental Playground for Vertical Industries Evolving FIRE into a 5G-Oriented Experimental Playground for Vertical Industries Spyros Denazis University of Patras, Greece 5GinFIRE.eu contact@5ginfire.eu 5GinFIRE 5GINFIRE is a three years Research

More information

Architecture and Design of Customer Support System using Microsoft.NET technologies

Architecture and Design of Customer Support System using Microsoft.NET technologies Architecture and Design of Customer Support System using Microsoft.NET technologies Nikolay Pavlov PU Paisii Hilendarski 236 Bulgaria Blvd. Bulgaria, Plovdiv 4003 npavlov@kodar.net Asen Rahnev PU Paisii

More information

Specific Proposals for the Use of Petri Nets in a Concurrent Programming Course

Specific Proposals for the Use of Petri Nets in a Concurrent Programming Course Specific Proposals for the Use of Petri Nets in a Concurrent Programming Course João Paulo Barros Instituto Politécnico de Beja, Escola Superior de Tecnologia e Gestão Rua Afonso III, n.º 1 7800-050 Beja,

More information

Self-Managing Network-Attached Storage

Self-Managing Network-Attached Storage 1 of 5 5/24/2009 10:52 PM ACM Computing Surveys 28(4es), December 1996, http://www.acm.org/pubs/citations/journals/surveys/1996-28-4es/a209-gibson/. Copyright 1996 by the Association for Computing Machinery,

More information

OUR VISION To be a global leader of computing research in identified areas that will bring positive impact to the lives of citizens and society.

OUR VISION To be a global leader of computing research in identified areas that will bring positive impact to the lives of citizens and society. Join the Innovation Qatar Computing Research Institute (QCRI) is a national research institute established in 2010 by Qatar Foundation for Education, Science and Community Development. As a primary constituent

More information

Load Balancing and Data Migration in a Hybrid Computational Fluid Dynamics Application

Load Balancing and Data Migration in a Hybrid Computational Fluid Dynamics Application Load Balancing and Data Migration in a Hybrid Computational Fluid Dynamics Application Esteban Meneses Patrick Pisciuneri Center for Simulation and Modeling (SaM) University of Pittsburgh University of

More information

Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21)

Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21) Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21) NSF-wide Cyberinfrastructure Vision People, Sustainability, Innovation, Integration Alan Blatecky Director OCI 1 1 Framing the

More information

Virtualization of Workflows for Data Intensive Computation

Virtualization of Workflows for Data Intensive Computation Virtualization of Workflows for Data Intensive Computation Sreekanth Pothanis (1,2), Arcot Rajasekar (3,4), Reagan Moore (3,4). 1 Center for Computation and Technology, Louisiana State University, Baton

More information

ACCI Recommendations on Long Term Cyberinfrastructure Issues: Building Future Development

ACCI Recommendations on Long Term Cyberinfrastructure Issues: Building Future Development ACCI Recommendations on Long Term Cyberinfrastructure Issues: Building Future Development Jeremy Fischer Indiana University 9 September 2014 Citation: Fischer, J.L. 2014. ACCI Recommendations on Long Term

More information

European Open Science Cloud

European Open Science Cloud European Open Science Cloud a common vision for accessing services for science and research. eage 04/12/2017 Enrique GOMEZ Programme Officcer e-infrastructure and Science Cloud EC DG CONNECT/C1 1 The EOSC

More information

An Open System Framework for component-based CNC Machines

An Open System Framework for component-based CNC Machines An Open System Framework for component-based CNC Machines John Michaloski National Institute of Standards and Technology Sushil Birla and C. Jerry Yen General Motors Richard Igou Y12 and Oak Ridge National

More information

Applying Microservices in Webservices, with An Implementation Idea

Applying Microservices in Webservices, with An Implementation Idea International Conference on Computer Applications 64 International Conference on Computer Applications 2016 [ICCA 2016] ISBN 978-81-929866-5-4 VOL 05 Website icca.co.in email icca@asdf.res.in Received

More information

Pegasus Workflow Management System. Gideon Juve. USC Informa3on Sciences Ins3tute

Pegasus Workflow Management System. Gideon Juve. USC Informa3on Sciences Ins3tute Pegasus Workflow Management System Gideon Juve USC Informa3on Sciences Ins3tute Scientific Workflows Orchestrate complex, multi-stage scientific computations Often expressed as directed acyclic graphs

More information