Opal: Simple Web Services Wrappers for Scientific Applications

Similar documents
Opal: Wrapping Scientific Applications as Web Services

Sriram Krishnan

An End-to-End Web Services-based Infrastructure for Biomedical Applications

The Opal Toolkit. Wrapping Scientific Applications as Web Services

GSI-based Security for Web Services

Providing Dynamic Virtualized Access to Grid Resources via the Web 2.0 Paradigm

An Eclipse-based Environment for Programming and Using Service-Oriented Grid

Grid Programming: Concepts and Challenges. Michael Rokitka CSE510B 10/2007

Sriram Krishnan, Ph.D. NBCR Summer Institute, August 2010

By Ian Foster. Zhifeng Yun

Research and Design Application Platform of Service Grid Based on WSRF

GAMA: Grid Account Management Architecture

Globus Toolkit 4 Execution Management. Alexandra Jimborean International School of Informatics Hagenberg, 2009

Introduction to GT3. Introduction to GT3. What is a Grid? A Story of Evolution. The Globus Project

Extending Grid Protocols onto the Desktop using the Mozilla Framework

Globus GTK and Grid Services

Design The way components fit together

Introduction to Grid Computing

Java Development and Grid Computing with the Globus Toolkit Version 3

Rapid Deployment of VS Workflows. Meta Scheduling Service

Grid Middleware and Globus Toolkit Architecture

Cloud Computing. Up until now

Grid Services and the Globus Toolkit

Lisa Banks Distributed Systems Subcommittee

Grid Computing Middleware. Definitions & functions Middleware components Globus glite

Task Management Service

GRAIL Grid Access and Instrumentation Tool

WSRF Services for Composing Distributed Data Mining Applications on Grids: Functionality and Performance

Building Services in WSRF. Ben Clifford GGF Summer School July 2004

Grid Computing. Lectured by: Dr. Pham Tran Vu Faculty of Computer and Engineering HCMC University of Technology

Grid-Based PDE.Mart: A PDE-Oriented PSE for Grid Computing

Regular Forum of Lreis. Speechmaker: Gao Ang

Introduction. Software Trends. Topics for Discussion. Grid Technology. GridForce:

UNIT IV PROGRAMMING MODEL. Open source grid middleware packages - Globus Toolkit (GT4) Architecture, Configuration - Usage of Globus

glite Grid Services Overview

MSF: A Workflow Service Infrastructure for Computational Grid Environments

CSF4:A WSRF Compliant Meta-Scheduler

PythonCLServiceTool: A Utility for Wrapping Command-Line Applications for The Grid David E. Konerding and Keith R. Jackson

SZDG, ecom4com technology, EDGeS-EDGI in large P. Kacsuk MTA SZTAKI

Grid Scheduling Architectures with Globus

Chapter 4:- Introduction to Grid and its Evolution. Prepared By:- NITIN PANDYA Assistant Professor SVBIT.

: ESB Implementation Profile

Grid Computing Fall 2005 Lecture 5: Grid Architecture and Globus. Gabrielle Allen

Classroom Exercises for Grid Services

Data Access and Analysis with Distributed, Federated Data Servers in climateprediction.net

Building Cyberinfrastructure for Bioinformatics Using Service Oriented Architecture

Web Services Development for IBM WebSphere Application Server V7.0

Workflow, Planning and Performance Information, information, information Dr Andrew Stephen M c Gough

GT-OGSA Grid Service Infrastructure

Using Resources of Multiple Grids with the Grid Service Provider. Micha?Kosiedowski

From Web Services Toward Grid Services

X100 ARCHITECTURE REFERENCES:

Introduction of PDE.Mart

Installation and Administration

Weka4WS: a WSRF-enabled Weka Toolkit for Distributed Data Mining on Grids

S.No QUESTIONS COMPETENCE LEVEL UNIT -1 PART A 1. Illustrate the evolutionary trend towards parallel distributed and cloud computing.

Independent Software Vendors (ISV) Remote Computing Usage Primer

TPF Users Group Fall 2007

PythonCLServiceTool: A Utility for Wrapping Command-Line Applications for The Grid David E. Konerding and Keith R. Jackson

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

ECLIPSE PERSISTENCE PLATFORM (ECLIPSELINK) FAQ

Customized way of Resource Discovery in a Campus Grid

ICENI: An Open Grid Service Architecture Implemented with Jini Nathalie Furmento, William Lee, Anthony Mayer, Steven Newhouse, and John Darlington

ActiveWorkflow Overview

Agent-Enabling Transformation of E-Commerce Portals with Web Services

Accelerating the Scientific Exploration Process with Kepler Scientific Workflow System

Database Assessment for PDMS

Developing and Deploying vsphere Solutions, vservices, and ESX Agents. 17 APR 2018 vsphere Web Services SDK 6.7 vcenter Server 6.7 VMware ESXi 6.

IBM Rational Application Developer for WebSphere Software, Version 7.0

CPSC 426/526. Cloud Computing. Ennan Zhai. Computer Science Department Yale University

Oracle Fusion Middleware

Goal: Offer practical information to help the architecture evaluation of an SOA system. Evaluating a Service-Oriented Architecture

Incorporation of Middleware and Grid Technologies to Enhance Usability in Computational Chemistry Applications

Tutorial 1: Introduction to Globus Toolkit. John Watt, National e-science Centre

CS6703 GRID AND CLOUD COMPUTING. Question Bank Unit-I. Introduction

UGP and the UC Grid Portals

Grid Compute Resources and Job Management

THE GLOBUS PROJECT. White Paper. GridFTP. Universal Data Transfer for the Grid

University At Buffalo COURSE OUTLINE. A. Course Title: CSE 487/587 Information Structures

Everest: A Cloud Platform for Computational Web Services

Grid Computing: Status and Perspectives. Alexander Reinefeld Florian Schintke. Outline MOTIVATION TWO TYPICAL APPLICATION DOMAINS

Deploying virtualisation in a production grid

Computational Web Portals. Tomasz Haupt Mississippi State University

Sentinet for BizTalk Server SENTINET

Introduce Grid Service Authoring Toolkit

How to build Scientific Gateways with Vine Toolkit and Liferay/GridSphere framework

Tools to Develop New Linux Applications

AIM Enterprise Platform Software IBM z/transaction Processing Facility Enterprise Edition 1.1.0

Developing Web Services. with Axis. Web Languages Course 2009 University of Trento

GT 4.2.0: Community Scheduler Framework (CSF) System Administrator's Guide

Component-based Grid Programming Using the HOC-Service Architecture

What's New in ActiveVOS 7.1 Includes ActiveVOS 7.1.1

Efficient HTTP based I/O on very large datasets for high performance computing with the Libdavix library

Leverage SOA for increased business flexibility What, why, how, and when

High Throughput WAN Data Transfer with Hadoop-based Storage

XSEDE Software and Services Table For Service Providers and Campus Bridging

On the Creation of Distributed Simulation Web- Services in CD++

A SEMANTIC MATCHMAKER SERVICE ON THE GRID

On Using BPEL Extensibility to Implement OGSI and WSRF Grid Workflows

What is it? What does it do?

Transcription:

Opal: Simple Web Services Wrappers for Scientific Applications Sriram Krishnan*, Brent Stearn, Karan Bhatia, Kim K. Baldridge, Wilfred W. Li, Peter Arzberger *sriram@sdsc.edu ICWS 2006 - Sept 21, 2006

NBCR Computational Infrastructure Set of Biomedical Applications QMView GAMESS APBS Autodock Rich Clients PMV ADT APBSCommand Vision Resources Continuity Gtomo2 TxBR Web Portals Continuity Telescience Portal Computational Grid Web Services Workflow Middleware

Requirements Enable access to scientific applications on Grid resources Seamlessly via a number of user interfaces Easily from the perspective of a scientific user Enable the creation of scientific workflows Possibly with the use of commodity workflow toolkits

Challenges for the Scientific User Access to Grid resources is still very complicated User account creation Management of credentials Installation and deployment of scientific software Interaction with Grid schedulers Data management

Towards Services Oriented Architectures (SOA) Scientific applications wrapped as Web services Provision of a SOAP API for programmatic access Clients interact with application Web services, instead of Grid resources Used in practice by several scientific communities NBCR: http://nbcr.net GEON: http://geongrid.org GLEON: http://gleon.org/ CAMERA: http://camera.calit2.net

Talk Outline Motivation for a Services Oriented Architecture (SOA) Overall end-to-end architecture The Opal Toolkit: Introduction & technical details Some use cases Concluding remarks

Architecture Overview Gemstone PMV/Vision Kepler State Mgmt Application Services Security Services (GAMA) Globus Globus Globus Condor pool SGE Cluster PBS Cluster

Scientific SOA: Benefits Applications are installed once, and used by all authorized users No need to create accounts for all Grid users Use of standards-based Grid security mechanisms Users are shielded from the complexities of Grid schedulers Data management for multiple concurrent job runs performed automatically by the Web service State management and persistence for long running jobs Accessibility via a multitude of clients

Possible Approaches Write application services by hand Pros: More flexible implementations, stronger data typing via custom XML schemas Cons: Not generic, need to write one wrapper per application Use a Web services wrapper toolkit, such as Opal Pros: Generic, rapid deployment of new services Cons: Less flexible implementation, weak data typing due to use of generic XML schemas

The Opal Toolkit: Overview Enables rapid deployment of scientific applications as Web services (< 2 hours) Steps Application writers create configuration file(s) for a scientific application Deploy the application as a Web service using Opal s simple deployment mechanism (via Apache Ant) Users can now access this application as a Web service via a unique URL

Tomcat Container Axis Engine Opal Architecture Container Properties Scheduler, Security, Database Setups Opal WS Opal WS Service Config Binary, Metadata, Arguments Cluster/Grid Resources

Implementation Details Implemented as a regular Axis service Application behavior specified by the service configuration Configuration passed as a parameter via the autogenerated deployment descriptor (WSDD) Possible to have multiple instances of the same class for different applications Distinguished by a unique URL for every application No need to generate sources or WSDL prior to deployment

Sample Container Properties # the base URL for the tomcat installation # this is required since Java can't figure out the IP # address if there are multiple network interfaces tomcat.url=http://ws.nbcr.net:8080 # database information database.use=false database.url=jdbc:postgresql://localhost/app_db database.user=<app_user> database.passwd=<app_passwd> # globus information globus.use=true globus.gatekeeper=ws.nbcr.net:2119/jobmanager-sge globus.service_cert=/home/apbs_user/certs/apbs_service.cert.pem globus.service_privkey=/home/apbs_user/certs/apbs_service.privkey # parallel parameters num.procs=16 mpi.run=/opt/mpich/gnu/bin/mpirun

Sample Application Configuration <appconfig xmlns="http://nbcr.sdsc.edu/opal/types" xmlns:xsd="http://www.w3.org/2001/xmlschema"> <metadata> <usage><![cdata[psize.py [opts] <filename>]]></usage> <info xsd:type="xsd:string"> <![CDATA[ --help --CFAC=<value> : Display this text : Factor by which to expand mol dims to get coarse grid dims [default = 1.7]... ]]> </info> </metadata> <binarylocation>/homes/apbs_user/bin/psize.py</binarylocation> <defaultargs>--gmemceil=1000</defaultargs> <parallel>false</parallel> </appconfig>

Application Deployment & Undeployment To deploy onto a local Tomcat container: ant -f build-opal.xml deploy -DserviceName=<serviceName> -DappConfig=<appConfig.xml> To undeploy a service: ant -f build-opal.xml undeploy -DserviceName=<serviceName>

Service Operations Get application metadata: Returns metadata specified inside the application configuration Launch job: Accepts list of arguments and input files (Base64 encoded), launches the job, and returns a jobid Query job status: Returns status of running job using the jobid Get job outputs: Returns the locations of job outputs using the jobid Get output as Base64: Returns an output file in Base64 encoded form Destroy job: Uses the jobid to destroy a running job

MEME+MAST Workflow using Kepler

Kepler Opal Web Services Actor

Ligand Protein Interaction Using Web Services Baldridge, Greenberg, Amoreira, Kondric GAMESS Service More accurate Ligand Information LigPrep Service Generation of Conformational Spaces PDB2PQR Service Protein preparation APBS Service Generation of electrostatic information QMView Service Visualization of electrostatic potential file Applications: Electrostatics and docking High-throughput processing of ligandprotein interaction studies Use of small molecules (ligands) to turn on or off a protein function

Conclusions We presented Opal, a toolkit for rapidly exposing legacy scientific applications as Web services Provides features like Job management, Scheduling, Security, and Persistence, in an easy to use and configurable manner Lowers inertia in the scientific community against the adoption of Web services and SOAs Currently being used by the NBCR community for Grid-enabling several scientific applications, via a multitude of interfaces

WSRF Integration Future Work State management using standard Grid mechanisms Asynchronous status notifications via WS-Notification Meta-scheduling and resource monitoring Use of CSF4 and GFarm Alternate mechanisms for I/O staging GridFTP, RFT Addition of strong data typing for I/O Use of XML schemas, DFDL

Thanks for listening! More information, downloads, documentation: http://nbcr.net/services/ Questions and comments welcome! sriram@sdsc.edu

Appendix

Grid Computing is Co-ordinated resource sharing and problem solving in dynamic multi-institutional virtual organizations. [Foster, Kesselman, Tuecke] Co-ordinated - multiple resources working in concert, eg. Disk & CPU, or instruments & database, etc. Resources - compute cycles, databases, files, application services, instruments. Problem solving - focus on solving scientific problems Dynamic - environments that are changing in unpredictable ways Virtual Organization - resources spanning multiple organizations and administrative domains, security domains, and technical domains

Grid Computing is (Industry) About finding distributed, underutilized compute resources (systems, desktops, storage) and provisioning those resources to users or applications requiring them. [The Grid Report, Clabby Analytics] Distributed - all the resources laying around in departments or server rooms. Underutilized - organizations save money by increasing utilization versus purchasing new resources. Resources - servers and server cycles, applications, data resources Provisioning - predict and schedule resource use depending on load.

Scientific Objective: Modeling and Analysis Across Scales Organisms Organ Modeling the Heart Modeling Synaptic Activity Tissue Cell ventricles Subcellular Macromolecular multicellular lattice Molecule Atom crossbridge Tools that Integrate Data, Construct Models and Perform Analysis across Scales filament