Kepler and Grid Systems -- Early Efforts --


Distributed Computing in Kepler
Lead, Scientific Workflow Automation Technologies Laboratory, San Diego Supercomputer Center (joint work with Matthew Jones)
6th Biennial Ptolemy Miniconference, Berkeley, CA, May 12, 2005

Distributed Computation is a Requirement in Scientific Computing
- Scientific workflows do scientific computing!
- There is an increasing need for data and compute capabilities.
- Data and computation should be combined for success: HEC + data management/integration. (Picture from Fran Berman.)

Kepler and Grid Systems -- Early Efforts --
- Some Grid actors are in place: Globus Job Runner, GridFTP-based file access, Proxy Certificate Generator. Each handles one job execution, but can be iterated.
- SRB support.
- Interaction with Nimrod and APST. Goal: to use their expertise in scheduling and job maintenance.
- Grid workflow pattern: STAGE FILES -> EXECUTE -> FETCH FILES, where Execute == Schedule -> Monitor & Recover.
- Open issues: data and process provenance, user interaction, reporting and logging.
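The stage/execute/fetch pattern above can be sketched as follows. This is an illustrative Python outline, not actual Kepler or Globus code; `stage`, `submit`, `poll`, and `fetch` are hypothetical stand-ins for the corresponding Grid actors.

```python
def stage(files):
    # STAGE FILES: copy inputs to the compute resource (e.g. via GridFTP); stub here.
    return [f"remote:{f}" for f in files]

def submit(job, staged):
    # Schedule: submit the job (e.g. via a Globus job runner); returns a handle.
    return {"job": job, "inputs": staged, "polls": 0}

def poll(handle):
    # Monitor: return "running", "done", or "failed"; here the second poll succeeds.
    handle["polls"] += 1
    return "done" if handle["polls"] >= 2 else "running"

def fetch(handle):
    # FETCH FILES: transfer result files back to the local workflow.
    return [f.replace("remote:", "local:") + ".out" for f in handle["inputs"]]

def run_grid_job(job, files, max_attempts=3):
    """STAGE FILES -> EXECUTE (Schedule -> Monitor & Recover) -> FETCH FILES."""
    staged = stage(files)
    for _ in range(max_attempts):        # Recover == resubmit on failure
        handle = submit(job, staged)
        status = poll(handle)
        while status == "running":
            status = poll(handle)        # real code would sleep between polls
        if status == "done":
            return fetch(handle)
    raise RuntimeError(f"{job} failed {max_attempts} times")

print(run_grid_job("terascale-run", ["params.dat"]))
# prints ['local:params.dat.out']
```

The retry loop is where the "Recover" half of the pattern lives: a failed execution is simply rescheduled with the already-staged inputs.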

Distributed Computing is Team Work
- Log in to, create, and join Grids, with role-based access
- Access data
- Execute services
- Discover and use existing workflows
- Design, share, annotate, run, and register workflows
So our distributed computing framework should support collaborations, while keeping control of scheduling and provenance information.

Goals and Requirements: Two Targets
1. Distributing execution
- Users can configure Kepler Grid access and execution parameters.
- Kepler should manage the orchestration of distributed nodes.
- Kepler will have the ability to do failure recovery.
- Users can detach from a workflow instance after starting it and connect to it again later.
2. Supporting on-the-fly online collaborations
- Users can log into the Kepler Grid and form groups.
- Users can specify who can share the execution.
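The detach-and-reconnect requirement above amounts to keeping a running workflow instance addressable by id independently of any attached user session. A minimal sketch, with entirely hypothetical names (this is not the Kepler API):

```python
class WorkflowInstance:
    """Detachable workflow run: execution continues server-side, and a user
    can reattach later by instance id. All names here are illustrative."""
    _registry = {}  # stand-in for a per-Grid instance registry

    def __init__(self, instance_id):
        self.id = instance_id
        self.progress = 0
        WorkflowInstance._registry[instance_id] = self

    def advance(self, steps=1):
        # Execution keeps going whether or not a user is attached.
        self.progress += steps

    @classmethod
    def reattach(cls, instance_id):
        # A returning user looks the run up by id and resumes monitoring it.
        return cls._registry[instance_id]

run = WorkflowInstance("run-42")
run.advance(3)                           # work happens while the user is away
same = WorkflowInstance.reattach("run-42")
assert same.progress == 3
```

The design point is that the instance, not the user's session, owns the execution state.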

A Peer-to-Peer System Satisfies These Goals
- In a peer-to-peer network, many or all of the participating hosts act both as client and server in the communication.
- The JXTA framework provides peers, peer groups, pipes, and messages.
- Message types:
  - Queries and responses for metadata
  - Requests and responses to move workflows and workflow components as .ksw files
  - Data flow messages in executing workflows

Creating KeplerGrid Using P2P Technology: Setting Up Grid Parameters
- Register as a peer and configure the Grid parameters.
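The three message types listed above might be modeled as follows. JXTA itself is a Java framework; this is an illustrative Python sketch of the message taxonomy and of a pipe as a one-way message channel, not the real JXTA API.

```python
from dataclasses import dataclass

@dataclass
class MetadataQuery:
    # Query/response pair for component metadata.
    component_id: str

@dataclass
class KswTransfer:
    # Request/response to move a workflow or component as a .ksw file.
    filename: str
    payload: bytes = b""

@dataclass
class DataFlowMessage:
    # A token passed between actors in an executing workflow.
    channel: str
    token: object = None

class Pipe:
    """Minimal stand-in for a JXTA pipe: a one-way message queue between peers."""
    def __init__(self):
        self._queue = []
    def send(self, msg):
        self._queue.append(msg)
    def receive(self):
        return self._queue.pop(0)

pipe = Pipe()
pipe.send(KswTransfer("myworkflow.ksw", b"<entity/>"))
msg = pipe.receive()
assert msg.filename == "myworkflow.ksw"
```

Keeping the three message kinds as distinct types lets a peer dispatch on message class rather than inspecting payloads.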

Creating KeplerGrid Using P2P Technology: Creating, Joining & Leaving Grids

Creating KeplerGrid Using P2P Technology: Distributing Computation on a Specific Grid
The P2P/JXTA Director:
- Decides on the overall execution schedule
- Communicates with the different nodes (peers) in the Grid
- Submits distributable jobs to remote nodes
- Can deduce from an actor's metadata whether it can run remotely
- Configuration parameters: the group to join
- Can support multiple models; using a master peer with static scheduling is the current focus (work in progress)
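The master-peer static scheduling mentioned above could look like the sketch below: before execution starts, the master assigns each actor whose metadata marks it distributable to a remote peer, and keeps the rest local. The `distributable` flag and the round-robin policy are assumptions for illustration.

```python
def static_schedule(actors, peers):
    """Assign distributable actors to remote peers round-robin;
    non-distributable actors stay on the master peer."""
    schedule, i = {}, 0
    for actor in actors:
        if actor.get("distributable"):        # deduced from actor metadata
            schedule[actor["name"]] = peers[i % len(peers)]
            i += 1
        else:
            schedule[actor["name"]] = "master"
    return schedule

actors = [
    {"name": "ReadData", "distributable": False},  # local I/O stays on the master
    {"name": "Simulate", "distributable": True},
    {"name": "Analyze",  "distributable": True},
]
print(static_schedule(actors, ["peerA", "peerB"]))
# prints {'ReadData': 'master', 'Simulate': 'peerA', 'Analyze': 'peerB'}
```

Because the whole assignment is computed up front, this is a static schedule; the more dynamic models mentioned later would instead assign work as peers become free.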

Creating KeplerGrid Using P2P Technology: Provenance, Execution Logs and Failure Recovery
- Built-in services for handling failures and resubmission
- Checkpointing
- Store data where you execute it, and send back only the metadata
- The master peer collects the provenance information; how can we do this without a global job database? (Work in progress.)

Status of Design and Implementation
- Initial tests with Grid creation, peer registration and discovery
- Starting with a basic execution model extending SDF; different execution models still need to be explored, and more dynamic models seem more suitable
- Big design decisions to think about: what to stage to remote nodes, scalability, detachability, certification and security
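The "store data where you execute it, send back metadata" scheme above can be sketched as follows: each peer checkpoints its results in its own local store and returns only a small provenance record, which the master accumulates in place of a global job database. All class and field names here are hypothetical.

```python
import hashlib

class Peer:
    """Executes a step, checkpoints the result locally, returns only metadata."""
    def __init__(self, name):
        self.name, self.store = name, {}

    def run(self, step, data):
        result = f"{step}({data})"     # stand-in for the real computation
        key = hashlib.sha1(result.encode()).hexdigest()[:8]
        self.store[key] = result       # checkpoint: the data stays on this peer
        return {"step": step, "peer": self.name, "key": key}  # metadata only

class MasterPeer:
    """Collects per-step provenance records instead of a global job database."""
    def __init__(self):
        self.provenance = []

    def record(self, meta):
        self.provenance.append(meta)

master, peer = MasterPeer(), Peer("peerA")
master.record(peer.run("Simulate", "input.dat"))
assert master.provenance[0]["peer"] == "peerA"
```

On failure, the `key` in the metadata lets the master ask the owning peer to re-fetch or recompute from the last checkpoint rather than restarting the whole workflow.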

To Sum Up
- Just distributing the execution is not enough; we also need to think about its usability!
- We need sub-services using the JXTA model for peer discovery, data communication, logging, and failure recovery.
- We might need more than one domain for different types of distributed workflows.

Questions?.. Thanks!
altintas@sdsc.edu
+1 (858) 822-5453
http://www.sdsc.edu